Date: 2023-08-03

Meta has announced the open-sourcing of AudioCraft, a suite of generative AI tools designed for creating music and audio based on text prompts. Content creators can use these tools to input simple text descriptions and generate complex audio landscapes, compose melodies, or simulate virtual orchestras.

AudioCraft comprises three core components. The first, AudioGen, generates various audio effects and soundscapes. For example, it can create sounds like a dog barking, a car horn honking, or footsteps on a wooden floor. The second component, MusicGen, creates musical compositions and melodies based on descriptions. It can generate songs of different genres from scratch, according to the provided criteria. Lastly, EnCodec is a neural network-based audio compression codec, which has been improved to allow for higher quality music generation with fewer artifacts.

While generative AI models for text and still images have gained significant attention, development of generative audio tools has lagged behind. Meta aims to address this gap by open-sourcing AudioCraft, with the code released under the MIT License, providing accessible tools for audio and musical experimentation.

Meta believes that its tools will contribute to the broader community and help advance the state of the art in audio generation. The company emphasizes that the models are available for research purposes and hopes that researchers and practitioners will be able to train their own models on their own datasets.

Notably, Meta states that MusicGen was trained on 20,000 hours of music owned by Meta or specifically licensed for this purpose. This move is likely an attempt to address concerns about undisclosed and potentially unethical training material, as seen in other generative AI models.

OpenAI, Google, and other research teams have previously explored AI-powered audio and music generation. This work, while not as widely recognized as image synthesis, requires complex modeling of signals and patterns at varying scales. Generating coherent, high-fidelity music is particularly challenging because it involves capturing both local and long-range patterns as well as expressive nuances.

Meta’s decision to open-source AudioCraft will likely lead open-source developers to integrate it into their own projects, potentially resulting in user-friendly generative audio tools in the future. The model weights and code for the AudioCraft tools are available on GitHub for those interested in exploring them further.