Yesterday, Xbox unveiled Muse, a generative AI model aimed at sparking creativity in game design through what Microsoft calls “gameplay ideation.” The release was accompanied by an accessible article on Nature.com, a detailed blog post, and a YouTube video explaining its nuances. If you’re scratching your head over that term, Microsoft describes it as generating “game visuals, controller actions, or both.” It sounds revolutionary, but its practical applications remain narrow and are a long way from transforming the traditional game development pipeline.
The numbers behind Muse are intriguing. Training ran at scale on Nvidia H100 GPUs, and it took around a million training updates before Muse could stretch one second of actual gameplay into nine seconds of simulated gameplay that mirrors the game engine’s behavior. Most of the training data was drawn from existing multiplayer sessions.
None of this ran on a single PC: Microsoft trained Muse on a cluster of 100 Nvidia H100 GPUs. That approach is costly and power-hungry, yet it yields only a modest output resolution of 300×180 pixels for those nine extra seconds of simulated gameplay.
The most captivating demonstration showed Muse duplicating existing props and enemies in the game environment and convincingly replicating their functions. But given the substantial hardware costs and power requirements, one might wonder why developers wouldn’t simply create these elements with conventional development tools instead.
It’s noteworthy that Muse managed to maintain object permanence and echo the game’s original behavior, but the approach seems extravagant next to established, efficient, traditional game development methods.
Future versions of Muse might pack more impressive tricks, but for now it feels like just another entry in the lineup of projects trying to generate complete gameplay via AI. Though the engine accuracy and object permanence are commendable, relying solely on this method to develop, test, or play video games seems unnecessarily complex. After spending hours with the available material, I still find it puzzling why anyone would prefer this approach.