Google’s MusicLM AI Can Generate Tunes from Text Prompts

Google is introducing a new artificial intelligence app called MusicLM that creates music in any style or genre based on text prompts and can translate a whistled melody or casually hummed snippet into instrument sounds. TechCrunch calls the technology “impressive” but says the Alphabet company, “fearing the risks, has no immediate plans to release it,” in recognition of the controversy surrounding AI models trained using copyrighted material. MusicLM was trained on a dataset of 280,000 hours of music, resulting in the ability to generate minutes-long songs of “significant complexity.”

This audio counterpart to DALL-E is documented in an academic paper that mentions “the many ethical challenges posed by a system like MusicLM, including a tendency to incorporate copyrighted material from training data into the generated songs,” writes TechCrunch.

An experiment “found that about 1 percent of the music the system generated was directly replicated from the songs on which it trained — a threshold apparently high enough to discourage them from releasing MusicLM in its current state.”

Google says it is “publicly releasing a dataset with around 5,500 music-text pairs, which could help when training and evaluating other musical AIs,” according to The Verge.

Google presents examples like “Berlin ’90s techno with a low bass and strong kick” and “enchanting jazz song with a memorable saxophone solo and a solo singer.” TechCrunch says the results “remarkably, sound something like a human artist might compose, albeit not necessarily as inventive or musically cohesive.”

“It also includes interpretations of phrases like ‘futuristic club’ and ‘accordion death metal,’” shares The Verge, noting “MusicLM can even simulate human vocals, and while it seems to get the tone and overall sound of voices right, there’s a quality to them that’s definitely off.”

AI-generated music dates back several decades, with systems credited for compositions ranging from pop songs to music in the style of Bach. It has also been used by researchers and experimental artists to accompany live performances.

“One recent version uses AI image generation engine StableDiffusion to turn text prompts into spectrograms that are then turned into music,” writes The Verge, noting that the Google researchers who created MusicLM say it “can outperform other systems in terms of its ‘quality and adherence to the caption,’ as well as the fact that it can take in audio and copy the melody.”

As for that last part, The Verge calls it “one of the coolest demos,” letting users input a hummed or whistled tune and then “hear how the model reproduces it as an electronic synth lead, string quartet, guitar solo, etc.”

Related:
After Inking Its OpenAI Deal, Shutterstock Rolls Out a Generative AI Toolkit to Create Images Based on Text Prompts, TechCrunch, 1/25/23
