The researchers claim that SIMA 2 can perform a range of complex tasks in virtual worlds, solve certain problems on its own, and communicate with its users. It can also improve itself, taking on harder tasks over time and learning through trial and error.
“Games have long been the driving force behind agent research,” Joe Marino, a research scientist at Google DeepMind, said at a press conference this week. He noted that even a simple action in the game, such as lighting a lantern, can involve multiple steps: “It's a really complex set of challenges that you have to solve in order to progress.”
The ultimate goal is to develop the next generation of agents that can follow instructions and carry out open-ended tasks in environments more complex than a web browser. In the long term, Google DeepMind wants to use such agents to control real robots. Marino said the skills acquired by SIMA 2, such as navigating an environment, using tools, and collaborating with people to solve problems, are important building blocks for future companion robots.
Unlike previous work on game agents such as AlphaGo, which beat a Go grandmaster in 2016, or AlphaStar, which defeated 99.8% of ranked human players in the 2019 video game StarCraft 2, the idea behind SIMA is to teach an agent to play an open-ended game without preset goals. Instead, the agent learns to follow instructions given to it by humans.
People control SIMA 2 via text chat, by speaking to it aloud, or by drawing on the game screen. The agent processes the video game's pixels frame by frame and determines what actions it needs to take to complete its tasks.
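At a high level, that loop — screen pixels plus a human instruction in, keyboard and mouse actions out — can be sketched as below. This is purely an illustrative stand-in: SIMA 2's internals are not public, and every name, action format, and the toy policy here are assumptions, not the real system.

```python
# Hypothetical sketch of an instruction-following game-agent loop, as
# described in the article: the agent sees only screen pixels and emits
# keyboard/mouse actions. All names and behaviors are illustrative.
from dataclasses import dataclass
from typing import List

@dataclass
class Action:
    kind: str      # e.g. "key", "mouse_move", "click"
    payload: str   # e.g. "w", or "120,340" for a screen coordinate

def choose_actions(frame: bytes, instruction: str) -> List[Action]:
    """Stand-in for the agent's policy: maps one video frame plus the
    user's instruction to low-level input actions. A real agent would
    run a learned model here; this toy version just pattern-matches."""
    if "lantern" in instruction.lower():
        # e.g. walk forward, then press the interact key
        return [Action("key", "w"), Action("key", "e")]
    return [Action("key", "w")]

def agent_loop(frames: List[bytes], instruction: str) -> List[Action]:
    """Process the game frame by frame, accumulating the actions the
    agent would send back to the game as keyboard/mouse input."""
    all_actions: List[Action] = []
    for frame in frames:
        all_actions.extend(choose_actions(frame, instruction))
    return all_actions
```

The point of the sketch is the interface, not the policy: everything the agent knows about the game arrives as pixels, and everything it does leaves as the same inputs a human player would use.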
Like its predecessor, SIMA 2 was trained on footage of people playing eight commercial video games, including No Man's Sky and Goat Simulator 3, as well as three virtual worlds created by the company. The agent learned to map what it sees on screen to keyboard and mouse actions.
The researchers say SIMA 2, now connected to Gemini, is much better at following instructions — asking questions and providing updates as tasks are completed — and at figuring out on its own how to perform some more complex tasks.
Google DeepMind tested the agent in environments it had never seen before. In one set of experiments, the researchers asked Genie 3, the company's latest world model, to create environments from scratch and dropped SIMA 2 into them. They found that the agent could navigate those worlds and follow instructions there.