In a surprising move, Apple and researchers from Columbia University have released an open-source multimodal large language model named Ferret. Departing from its usual guarded approach, Apple has made the model’s code and weights available, although the license restricts their use to research purposes. Ferret can analyze a specific region of an image and answer questions about it.

For instance, if a user selects a dog in an image and asks about its breed, Ferret can answer. The model can also draw on other objects in the image as context to work out what the dog is doing. This capability shows Ferret’s potential to help researchers in fields ranging from image recognition to natural language processing.

Ferret comes in two sizes: a 7-billion-parameter model and a 13-billion-parameter model. The smaller version is likely aimed at iOS devices, given the memory and compute limits of mobile hardware. This aligns with Apple’s recent efforts to build more AI features into its devices.
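To give a rough sense of why model size matters on mobile hardware, the sketch below estimates how much memory the weights alone would occupy at a few numeric precisions. The precisions listed are illustrative assumptions, not formats Apple has announced for Ferret, and the estimate ignores activations and other runtime overhead.

```python
# Back-of-the-envelope estimate of weight storage only.
# The precisions below are assumptions for illustration, not Ferret's shipped formats.
PARAM_COUNTS = {"Ferret 7B": 7_000_000_000, "Ferret 13B": 13_000_000_000}
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Return the weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, num_params in PARAM_COUNTS.items():
        for precision, size in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(num_params, size)
            print(f"{name} @ {precision}: {gb:.1f} GB")
```

At 16-bit precision the 7-billion-parameter model needs about 14 GB for its weights and the 13-billion-parameter model about 26 GB, which is why the smaller variant is the more plausible candidate for on-device use.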

To further support the research community, Apple has introduced Ferret-Bench, a benchmark designed to evaluate how well multimodal models refer to and ground specific regions of an image. It gives researchers a common yardstick for analyzing and refining their experiments with Ferret.

With the release of Ferret, Apple signals a more open approach to artificial intelligence research. By providing access to the model, the company invites collaboration from the wider AI community.

