Meta now has an AI model that can create music from text prompts

12 Jun 2023

Image: © MMollaretti/Stock.adobe.com

Known as MusicGen, the new Meta AI model is similar to Google’s MusicLM and was trained on 20,000 hours of licensed music.

Meta has launched a new open-source AI model called MusicGen that can generate music based on text prompts.

News of the release was first shared on LinkedIn by Facebook research scientist Gabriel Synnaeve and reported on by The Decoder. Synnaeve shared the GitHub link to MusicGen on Saturday (10 June) and called it a “simple and controllable music generation model”.

Built on top of Meta’s EnCodec audio tokenizer, MusicGen can be prompted by both text and melody. This means it can generate short pieces of music from text entered by a user and, thanks to its transformer model, complete a melody it is given to hear.

According to The Decoder, the team led by Synnaeve used 20,000 hours of licensed music to train MusicGen. This included an internal dataset of 10,000 high-quality music tracks as well as music data from Shutterstock and Pond5.

“Unlike prior work, MusicGen is a single stage transformer LM which uses an efficient token interleaving pattern. Hence [it] eliminates the need for cascading several models (eg hierarchically or upsampling),” Synnaeve explained.

“We release code and pretrained models publicly for open research, reproducibility and for the broader music community to investigate this technology.”
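The article does not spell out what a “token interleaving pattern” looks like. As a rough illustration only, the sketch below assumes a “delay” style pattern: an audio tokenizer emits several parallel token streams (codebooks) per audio frame, and each stream is shifted one step later than the one before it, so a single model can predict one token per stream at every step. The function names, the four codebooks and the toy token values are hypothetical and are not taken from Meta’s code.

```python
from typing import List, Optional

# Placeholder for positions where a codebook has no token yet (assumption).
PAD: Optional[int] = None


def delay_interleave(codes: List[List[int]]) -> List[List[Optional[int]]]:
    """Shift codebook k right by k steps.

    codes[k][t] is the token of codebook k at audio frame t.
    Returns one row per sequence step, each holding one entry per codebook.
    """
    num_codebooks = len(codes)
    num_frames = len(codes[0])
    total_steps = num_frames + num_codebooks - 1
    steps = []
    for s in range(total_steps):
        row = []
        for k in range(num_codebooks):
            t = s - k  # frame that codebook k contributes at step s
            row.append(codes[k][t] if 0 <= t < num_frames else PAD)
        steps.append(row)
    return steps


def undo_delay(steps: List[List[Optional[int]]], num_frames: int) -> List[List[int]]:
    """Invert delay_interleave to recover the per-codebook token streams."""
    num_codebooks = len(steps[0])
    return [
        [steps[t + k][k] for t in range(num_frames)]
        for k in range(num_codebooks)
    ]


# Toy input: 4 hypothetical codebooks, 3 audio frames each.
codes = [
    [10, 11, 12],
    [20, 21, 22],
    [30, 31, 32],
    [40, 41, 42],
]
steps = delay_interleave(codes)
for s, row in enumerate(steps):
    print(s, row)
```

In this toy layout, three frames across four codebooks take six sequence steps. Flattening every token into one long sequence would take twelve, which is one way to read the “efficient” in Synnaeve’s description.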

Synnaeve and his team of researchers demonstrated in a paper published on arXiv, the preprint server hosted by Cornell University, how MusicGen can generate high-quality samples while being conditioned by text and melody, “allowing better controls over the generated output”.

“We conduct extensive empirical evaluation, considering both automatic and human studies, showing the proposed approach is superior to the evaluated baselines on a standard text-to-music benchmark,” the team wrote.

Meta has also published music samples and comparisons of MusicGen with rivals such as Google’s MusicLM.

The rush to find useful applications for generative AI across text, music and video was prompted by the surging popularity of OpenAI’s ChatGPT chatbot, released in November 2022. But the technology has also raised concerns around copyright infringement and plagiarism in various scenarios.

Just last week, for example, a paper published in Cell Reports Physical Science claimed to have found a new approach that distinguishes academic science writing by humans from text generated by ChatGPT with more than 99pc accuracy using ‘off-the-shelf’ machine learning.
