Frustrated by the ending of Game of Thrones? Let's redo it our way.
I've experimented with creating an extremely simple video game, inspired by the books I read decades ago—those "choose-your-own-adventure" stories where you're the hero. The difference here is that everything is generated dynamically at each step:
- the story
- the choices
- the voiceover
- the illustrations
- and the music (which I removed, for reasons I'll explain later); a rough sketch of what one generated step contains is shown just after this list.
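To make the idea concrete, here is a minimal sketch of what a single generated step could look like, assuming a simple Python data model; the names are illustrative, not the actual implementation:

```python
from dataclasses import dataclass

@dataclass
class StoryStep:
    """One dynamically generated step of the adventure (illustrative model)."""
    narrative: str          # the story text for this scene
    choices: list[str]      # the options offered to the player
    voiceover_path: str     # path to the generated narration audio
    illustration_path: str  # path to the generated illustration
    # music is intentionally absent, since that part of the pipeline was dropped
```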
I've seen multiple videos on YouTube, and even posts on LinkedIn, where people claimed they could create complex software with very little effort. Video games are clearly part of this phenomenon, since they blend art and engineering at their finest (not to mention fierce market competition, with thousands of new titles released every year).
What I tried to show here is that, nowadays, you can create a story that never ends.
If it’s so easy to create simple games, then what’s the real value? For me, it goes back to what Elon Musk said: learning should be as close as possible to a video game. If we combine AI’s capabilities with some brilliant minds, we could teach many subjects as if they were games.
Of course, we have to remember that real learning involves doing and practicing; just clicking buttons, sometimes randomly, has limited educational value. The challenge is extending that hands-on practice beyond the screen while keeping the experience game-like.
Returning to my experiment, here is what I learned. I used only OpenAI's tools: an orchestrator sets up the situation, and separate calls then generate the various assets. One component, however, was a complete failure: music creation. My initial assumption was that ChatGPT/OpenAI’s APIs could at least transcribe famous pieces into MIDI files (a format that is easy to manipulate and has been a standard for decades). It was a total disaster: even simple melodies weren’t reproduced accurately, so I gave up on generating original compositions.
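For illustration only, here is a rough sketch of how that orchestration could look with the OpenAI Python SDK. The model names, prompts, and function names are assumptions on my part, not the project's actual code, and a real game loop would add error handling and persistent state:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_step(story_so_far: str) -> dict:
    """Orchestrate one step: continue the story, then produce voiceover and illustration."""
    # 1. Orchestrator call: continue the story and propose the next choices.
    completion = client.chat.completions.create(
        model="gpt-4o",  # assumed model choice
        messages=[
            {
                "role": "system",
                "content": (
                    "You are the narrator of a choose-your-own-adventure game. "
                    "Continue the story in one short paragraph, then list three choices."
                ),
            },
            {"role": "user", "content": story_so_far},
        ],
    )
    narrative = completion.choices[0].message.content

    # 2. Voiceover: text-to-speech on the freshly generated narrative.
    speech = client.audio.speech.create(
        model="tts-1", voice="alloy", input=narrative  # assumed voice and model
    )
    with open("step_voiceover.mp3", "wb") as audio_file:
        audio_file.write(speech.content)

    # 3. Illustration: a single image for the scene.
    image = client.images.generate(
        model="dall-e-3",  # assumed model choice
        prompt=f"Storybook illustration of this scene: {narrative[:500]}",
        n=1,
    )

    return {
        "narrative": narrative,
        "voiceover_path": "step_voiceover.mp3",
        "illustration_url": image.data[0].url,
    }
```

Each step therefore costs at least three API calls (text, speech, image), which is also where the latency and cost concerns below come from.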
Another issue is that creating consistent characters is a real challenge. The tips I found here and there were either very complex or outdated, so I dropped the idea. Also, you do need some engineering know-how to put it all together. Each transition is quite slow since it’s dynamically generated. You could add some buffer to “pre-generate” these transitions, but that requires extra implementation steps and increases API costs.
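As a sketch of that "pre-generate" buffer, one option is to speculatively generate the next step for every visible choice while the player is still reading the current scene. This assumes the hypothetical generate_step function from the previous sketch and simply trades extra API calls, one full generation per branch, for faster transitions:

```python
from concurrent.futures import ThreadPoolExecutor

def prefetch_next_steps(story_so_far: str, choices: list[str]) -> dict:
    """Speculatively generate the next step for every visible choice.

    This hides most of the transition latency, but pays for every branch,
    including the ones the player never takes.
    """
    with ThreadPoolExecutor(max_workers=len(choices)) as pool:
        futures = {
            choice: pool.submit(
                generate_step, f"{story_so_far}\nThe player chose: {choice}"
            )
            for choice in choices
        }
        # A real game would keep these futures around and resolve only the one
        # matching the player's actual pick; here we simply wait for all of them.
        return {choice: future.result() for choice, future in futures.items()}
```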
So, what’s next? Video games, like any software, are never truly complete; there are always improvements to be made. The answer depends heavily on your use case. It takes strong creative direction—despite the heavy lifting done by GenAI—and having everything generated might not make sense unless you’re willing to break the bank.
This is another example of a paradigm shift in how we interact with content overall, similar to chatbots or Retrieval-Augmented Generation (RAG), where people increasingly skip reading documents front to back and instead query them directly for what they need. This shift opens up new possibilities, from revolutionizing storytelling to creating entirely new content consumption and learning formats. The key is striking the right balance between AI-generated and human-curated elements, ensuring quality without overwhelming costs.
Where do you see this experiment going? Would you focus on refining the gaming aspect, exploring education applications, or evolving it into a collaborative storytelling platform?