Date: 2023-08-01

Google DeepMind has introduced Robotics Transformer 2 (RT-2), a vision-language-action (VLA) model that allows robots to perform new tasks without task-specific training. Much as language models learn from web-scale data, RT-2 uses text and images from the web to learn real-world concepts and translate that understanding into generalized instructions for robot actions. This technology could lead to adaptable robots that can perform various tasks in different situations and environments with minimal training.

RT-2 is an upgrade from its predecessor, RT-1, which was trained on 130,000 demonstrations and enabled robots from Everyday Robots to achieve a 97% success rate across more than 700 tasks. DeepMind trained RT-2 by combining the robotic data from RT-1 with web datasets. The key differentiator of RT-2 is that it does not require a large volume of robot-specific data to make a robot work. Unlike previous models, it learns from a small amount of robotic data to perform complex reasoning, and it can transfer the acquired knowledge to direct robot actions, even for tasks it has never encountered before.

The improved generalization capabilities of RT-2 allow it to interpret new commands and respond to user instructions through rudimentary reasoning about object categories or high-level descriptions. Vincent Vanhoucke, head of robotics at Google DeepMind, explained that with RT-2, robots can perform actions they were never explicitly trained on. For example, RT-2 can identify trash and throw it away without having been trained on that specific task.

In internal tests, RT-2 performed as well as RT-1 on familiar tasks. In unfamiliar scenarios, however, its success rate nearly doubled, from RT-1's 32% to 62%. The potential applications of advanced vision-language-action models like RT-2 include context-aware robots that can reason, solve problems, and interpret information to perform a wide range of actions in real-world situations.

The AI-driven robotics market is expected to grow significantly, with its value projected to rise from $6.9 billion in 2021 to $35.3 billion in 2026, a compound annual growth rate (CAGR) of 38.6%.
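
The growth rate quoted above can be checked against the two market-value figures. The following is a minimal sketch, assuming the standard CAGR formula (end value divided by start value, raised to one over the number of years, minus one) and a five-year span from 2021 to 2026. The function name `cagr` is illustrative and does not come from the original report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market-value figures from the article, in billions of US dollars.
# Assumption: the period runs from 2021 to 2026, i.e. five years of growth.
rate = cagr(6.9, 35.3, 2026 - 2021)
print(f"CAGR: {rate:.1%}")  # prints "CAGR: 38.6%"
```

Under these assumptions the result matches the 38.6% figure cited for the market projection.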