Google's AI-generated music technology is back in the spotlight. The AI boom of 2022 saw the rise of advanced generative AI models such as ChatGPT and DALL-E 2, capable of producing impressive text or images in response to user prompts. However, only some generative AI models gained attention during this time. Several companies also trained AI models to generate music in response to text, audio, or image prompts. For instance, OpenAI, the renowned research firm behind ChatGPT and DALL-E 2, had already released an AI music generator called “Jukebox” back in 2020.
Despite these advancements, AI-generated music has not been embraced as enthusiastically as its text and image counterparts. One of the main reasons is that its outputs are often low-fidelity, simplistic, and lacking in traditional song structures, such as repeating choruses. AI music generators will need to overcome these shortcomings to gain wider acceptance among musicians and music listeners.
One notable example is Google’s AI music generator, MusicLM, which has been likened to ChatGPT for audio. Like ChatGPT, the system uses neural networks to generate output in response to various prompts. For instance, users can provide text, audio, or image prompts, and the AI music generator will produce a piece of music based on these inputs. While it is still a relatively new technology, Google’s AI music generator has shown promising results, composing music that is more sophisticated and enjoyable to listen to than earlier AI-generated music.
Comparing AI Music Generators: Unveiling the Soundscapes of MusicLM, Mubert, and Riffusion
The potential applications of AI-generated music are vast. In the entertainment industry, it could be used to compose original soundtracks for movies, video games, and commercials. In music production, it could help musicians and composers generate new ideas and explore different styles and genres. It could also serve as a tool for music education, helping aspiring musicians learn and practice their skills.
Despite the progress made in AI-generated music, challenges remain. One of the main challenges is achieving higher fidelity in the generated music, including the more complex melodies, harmonies, and arrangements characteristic of human-composed music. Another is ensuring that AI-generated music respects copyright laws and does not infringe on the rights of original music creators. There are also ethical questions to weigh, such as the potential for job displacement among human musicians and composers and the role of human creativity in the music industry.
During the evaluation stage, Google compared three text-to-music AI systems: MusicLM, Mubert, and Riffusion. The evaluation used quantitative metrics to assess the audio quality of each generated music clip and its adherence to a given text description. Additionally, human evaluators were presented with text captions from MusicCaps, along with two audio clips, and asked which clip better matched the caption.
According to a paper shared by Google on the preprint server arXiv, MusicLM outperformed the other AI systems in all evaluated aspects. In other words, Google's evaluation rated MusicLM above Mubert and Riffusion both in audio quality and in alignment with the provided text descriptions.
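One simple way to turn side-by-side listening judgments like these into a comparison between systems is to count how often each system's clip was preferred. The Python sketch below illustrates that idea only; it is not the procedure from Google's paper. The function name, the tuple layout, and the placeholder system labels "A", "B", and "C" are all assumptions made for illustration, and the sample judgments are invented, not real results.

```python
from collections import Counter


def tally_pairwise_wins(judgments):
    """Count how often each system was preferred in side-by-side comparisons.

    Hypothetical data layout (an assumption, not taken from the paper):
    each judgment is a tuple (system_a, system_b, preferred), where
    `preferred` is system_a, system_b, or None for "no preference".
    """
    wins = Counter()
    for system_a, system_b, preferred in judgments:
        # Register both systems so that a system with zero wins still appears.
        wins[system_a] += 0
        wins[system_b] += 0
        if preferred is None:
            continue  # the listener had no preference; nobody gets a win
        if preferred not in (system_a, system_b):
            raise ValueError(f"{preferred!r} was not one of the compared systems")
        wins[preferred] += 1
    return dict(wins)


if __name__ == "__main__":
    # Invented placeholder judgments, for demonstration only.
    sample = [
        ("A", "B", "A"),
        ("A", "C", None),
        ("B", "C", "C"),
        ("A", "B", "A"),
    ]
    print(tally_pairwise_wins(sample))
```

A raw win count is the crudest possible summary. A real study would also report how many comparisons each system took part in and how strongly listeners preferred one clip over the other.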
Google’s MusicLM and the Quest for Quality and Copyright Protection
Google’s AI music generator has made strides in producing audio that sounds closer to human-written music. However, it still falls short in replicating traditional song structures, and its vocals are of poor quality, with unintelligible lyrics.
One challenge that Google is working to overcome before releasing MusicLM to the public is that about 1% of its output can be approximately matched to audio in its training data. This creates a risk that AI-generated music could misappropriate creative content. According to Google, future work on the system could focus on addressing these issues and improving the overall quality of the audio.
The researchers wrote, “We acknowledge the risk of potential misappropriation of creative content associated to the use case … We strongly emphasize the need for more future work in tackling these risks associated to music generation.”