#News ·2025-01-08
Google recently formed a new team dedicated to developing AI models that can simulate the physical world, known as "world models."
The team is part of Google DeepMind and is led by Tim Brooks, who previously worked at OpenAI, where he was one of the lead developers of the video generation tool Sora. He joined Google DeepMind in October last year.
Brooks said that DeepMind has ambitious plans to build massive generative models that simulate the physical world, and that he is hiring for a new team with this mission.
The new team will work with the Google Gemini, Veo, and Genie teams to solve key new problems. Gemini is Google's flagship family of AI models, used for tasks such as analyzing images and generating text; Veo is Google's own video generation model; and Genie is Google's world model, which can simulate games and 3D environments. The latest version of Genie was unveiled in December and can already generate a wide variety of playable 3D worlds.
The job posting for the new team reads: "We believe that scaling AI training on video and multimodal data is on the critical path to AGI."
AGI (artificial general intelligence) is the shared goal of the top AI companies. It refers to AI that can perform any task a human can.
Many in the AI industry believe that world models are the next big thing in AI. The concept of a "world model" is borrowed from the mental models of the world that humans develop naturally.
The human brain builds abstract representations from sensory input to deepen its understanding of the surrounding world. These so-called "models" let the brain make predictions, which in turn shape how people perceive the world.
A baseball batter has only milliseconds to decide how to swing, which is less time than it takes for visual signals to reach the brain. The reason a batter can hit a fastball traveling at 100 miles per hour (about 160 kilometers per hour) is that they can instinctively predict where the ball will go.
Some scientists believe that this ability to reason subconsciously, based on models of the world, is a main source of humans' superior intelligence.
Once the technology matures, world models could power applications in many domains, such as visual reasoning, simulation, planning for embodied agents, and real-time interactive entertainment.
According to the job description, the new team will develop real-time interactive generation tools on top of models Google has already built, and will study how to integrate them with existing multimodal models such as Gemini.
Many startups and tech giants are working to develop world models, including Fei-Fei Li's World Labs, the Israeli company Decart, and Odyssey. They believe that successful world models could be used to create interactive media such as games and movies, and to build realistic simulation environments for training robots.
The creative community is divided over this new technology. Activision Blizzard, for example, has adopted AI tools aggressively to improve productivity while laying off some employees. According to a recent report commissioned by the Animation Guild, more than 100,000 film, television, and animation jobs in the United States will be disrupted by AI by 2026.
Odyssey, a relative newcomer to world model development, says it is working with creative professionals and has no plans to replace them. Whether Google's physical world simulation AI will replace creatives remains to be seen.
Copyright is another obstacle to developing world models. Some world models appear to be trained on video game footage without a license from the copyright holders, which could lead to legal disputes.
Google, which owns YouTube, says it has permission to train its models on YouTube videos, but it has not said which videos it will use.
Beyond these issues, world models still face many unsolved technical problems, and even Google remains a long way from success.
Like all AI models, world models suffer from "hallucinations" and inherit the biases of their training data. A model trained mostly on footage of European cities in sunny weather, for example, may struggle to understand snowy Asian cities. Without sufficiently broad data, models cannot understand the world deeply.
Runway CEO Cristóbal Valenzuela recently said that data and engineering issues make it a huge challenge to accurately capture the behavior of a world's "inhabitants" (such as animals and people). A model also needs to generate a consistent map of the environment and be able to navigate and interact with it.
The challenges are huge, but if they are solved, world models could connect AI more closely to the real world. That would bring breakthroughs not only in virtual world generation tools but also in robotics and AI decision-making.