Date: 2023-08-07

Creating bots that can communicate organically with humans using language has always been a goal of artificial intelligence. Current embodied agents can understand and execute simple commands, but struggle to comprehend the full range of language expressions used in real-world situations, including knowledge transmission and coordination.

Researchers from UC Berkeley have introduced Dynalang, an agent that learns to behave and model the world using language. Dynalang separates the learning of behavior through reinforcement learning from the learning of the world through supervised learning. The agent’s world model receives visual and textual inputs, which are compressed into a latent space. The model is then trained to anticipate future representations based on data gathered online as the agent interacts with the environment. The policy is trained to make decisions that maximize task reward using the latent representation.
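The split between the two learning signals can be made concrete with a toy example. The sketch below is not the Dynalang implementation. It is a minimal pure-Python illustration, and every name in it (`ToyWorldModel`, `ToyPolicy`, `train_step`, `reinforce_step`) is hypothetical. It makes several simplifying assumptions: the encoder is a fixed random linear map rather than a trained network, the future predictor is a single linear layer, and the policy is updated with plain REINFORCE, which stands in here for whatever reinforcement learning method the actual agent uses.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def init_matrix(rows, cols, rng):
    """Small random weights drawn from a seeded generator."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyWorldModel:
    """Hypothetical stand-in for a world model: encoder plus next-latent predictor."""

    def __init__(self, image_dim, vocab_size, latent_dim, seed=0):
        rng = random.Random(seed)
        self.vocab_size = vocab_size
        # Assumption: encoder weights stay fixed in this toy; a real model trains them too.
        self.enc = init_matrix(latent_dim, image_dim + vocab_size, rng)
        self.dyn = init_matrix(latent_dim, latent_dim, rng)

    def encode(self, image, token):
        """Compress one image feature vector and one text token into a latent."""
        one_hot = [1.0 if i == token else 0.0 for i in range(self.vocab_size)]
        return matvec(self.enc, list(image) + one_hot)

    def train_step(self, z, z_next, lr=0.1):
        """Supervised update: shrink the squared error of the next-latent prediction."""
        err = [p - t for p, t in zip(matvec(self.dyn, z), z_next)]
        for i, e in enumerate(err):
            for j, zj in enumerate(z):
                self.dyn[i][j] -= lr * 2.0 * e * zj
        return sum(e * e for e in err)  # loss measured before this update


class ToyPolicy:
    """Hypothetical softmax policy that sees only the latent, never the raw inputs."""

    def __init__(self, latent_dim, num_actions, seed=1):
        self.W = init_matrix(num_actions, latent_dim, random.Random(seed))

    def probs(self, z):
        scores = matvec(self.W, z)
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def reinforce_step(self, z, action, reward, lr=0.5):
        """Reward-driven update (REINFORCE): raise the log-probability of rewarded actions."""
        p = self.probs(z)
        for k in range(len(self.W)):
            grad = (1.0 if k == action else 0.0) - p[k]
            for j, zj in enumerate(z):
                self.W[k][j] += lr * reward * grad * zj


wm = ToyWorldModel(image_dim=4, vocab_size=5, latent_dim=3)
policy = ToyPolicy(latent_dim=3, num_actions=2)

# Made-up observations: one image feature vector and one token per time step.
z_t = wm.encode([0.2, 0.9, 0.1, 0.5], token=2)
z_next = wm.encode([0.3, 0.8, 0.1, 0.4], token=4)

# World model: trained only on prediction error, with no reward involved.
losses = [wm.train_step(z_t, z_next) for _ in range(50)]

# Policy: trained only on reward, using the latent as its input.
p_before = policy.probs(z_t)[1]
for _ in range(50):
    policy.reinforce_step(z_t, action=1, reward=1.0)
p_after = policy.probs(z_t)[1]
```

In this toy, the world model never sees a reward and the policy never sees raw pixels or tokens, which mirrors the separation described above. The prediction loss falls over the repeated `train_step` calls, and the probability of the rewarded action rises over the `reinforce_step` calls.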

By combining language and visual experience, Dynalang learns to comprehend various forms of language, predict future observations, and carry out tasks more efficiently. It outperforms state-of-the-art RL algorithms and task-specific designs in a wide range of domains. The Dynalang framework also allows for unified language production, where an agent’s perception influences its language model.

The research makes three main contributions. First, it introduces Dynalang, which connects language to visual experience through future prediction. Second, it shows that Dynalang can comprehend various forms of language and outperform previous algorithms across different tasks. Third, it opens up the possibility of combining language generation and pretraining in a single model.