Google DeepMind has unveiled new research highlighting an AI agent that's able to carry out a swath of tasks in 3D games it hasn't seen before. The team has long been experimenting with AI models that can win in the likes of Go and chess, and even learn games without being told their rules. Now, for the first time, according to DeepMind, an AI agent has shown it's able to understand a wide range of gaming worlds and carry out tasks within them based on natural-language instructions.
The researchers teamed up with studios and publishers such as Hello Games (No Man's Sky), Tuxedo Labs (Teardown) and Coffee Stain (Valheim and Goat Simulator 3) to train the Scalable Instructable Multiworld Agent (SIMA) on nine games. The team also used four research environments, including one built in Unity in which agents are instructed to form sculptures using building blocks. This gave SIMA, described as "a generalist AI agent for 3D virtual settings," a range of environments and settings to learn from, with a variety of graphics styles and perspectives (first- and third-person).
"Each game in SIMA's portfolio opens up a new interactive world, including a range of skills to learn, from simple navigation and menu use, to mining resources, flying a spaceship or crafting a helmet," the researchers wrote in a blog post. Learning to follow directions for such tasks in video game worlds could lead to more useful AI agents in any environment, they noted.
The researchers recorded humans playing the games and noted the keyboard and mouse inputs used to carry out actions. They used this information to train SIMA, which has "precise image-language mapping and a video model that predicts what will happen next on-screen." The AI is able to comprehend a range of environments and carry out tasks to accomplish a certain goal.
The researchers say SIMA doesn't need a game's source code or API access — it works on commercial versions of a game. It also needs just two inputs: what's
Read more on engadget.com