David Lewis Talking Tech & Audio

MusicLM from Google – a massive 1st step…but will it have a happy ending?

Yesterday we may just have witnessed something pretty remarkable…

MusicLM from Google

It was coming

MusicLM – is it simply another step on the AI journey, or is it another brick in the creatives’ wall?

Being an Apple-focused creator, I only had one eye on Google’s annual I/O event yesterday. While the headlines may have been dominated by hardware such as Pixel Fold & Google Pixel Watch, Google inevitably handed over a fair amount of the conference to Artificial Intelligence (AI) and, in particular, some new Generative AI (GAI) tools that could forever change the way we make and enjoy music.

A lot of time was rightly handed over to AI yesterday by Google CEO Sundar Pichai. But my interest was piqued when Pichai turned his attention to Google MusicLM.

So much so soon

It seems only the blink of an eye ago that DALL·E was the new kid on the AI block. Then along came the game-changing ChatGPT, and others like Midjourney swiftly followed.

As with many of us, my life has not been untouched by AI. Almost every day I’ll use it in one form or another: Adobe’s Sensei, Siri or, more recently, Phind, which is a great creator’s tool by the way.

We seem to have jumped from AI to generative AI quickly, and yet it is already frighteningly good. To think this is as bad as it will ever be is both sobering and exciting.

Simply put, MusicLM can turn simple typed prompts into music from any genre. As Google’s researchers describe it:

“MusicLM casts the process of conditional music generation as a hierarchical sequence-to-sequence modelling task, and it generates music at 24 kHz that remains consistent over several minutes.”

MusicLM is not the first GAI platform for music generation. Others such as Riffusion, Dance Diffusion and OpenAI’s Jukebox have all come before, but have been limited. Where MusicLM gets the nod is that, in time, it should be able to create high-fidelity, complex compositions.

The backstory

It should become so competent partly because it has the weight of Google behind it, but also because of the data it has been trained on.

Currently, it is in beta form only, and Google has stressed there are “no plans to release models at this point,” citing the need for more work. In the paper describing MusicLM’s capabilities, Google said it has been trained on a dataset of 280,000 hours of music. It is that which should give it the ability to eventually generate coherent songs from the prompts offered.

I have created a Dropbox link for you here to experience what it is currently capable of. The numbers against the prompts below relate to the tracks in the folder:

  1. “The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember but with unexpected sounds, like cymbal crashes or drum rolls.”
  2. “Epic soundtrack using orchestral instruments. The piece builds tension and creates a sense of urgency. An a cappella chorus sings in unison, it creates a sense of power and strength.”
  3. “This is an r&b/hip-hop music piece. There is a male vocal rapping and a female vocal singing in a rap-like manner. The beat is comprised of a piano playing the chords of the tune with an electronic drum backing. The atmosphere of the piece is playful and energetic. This piece could be used in the soundtrack of a high school drama movie/TV show. It could also be played at birthday parties or beach parties.”

Only the start

I’ll be the first to say that the samples are still relatively simple and basic – but this is pretty much day one of MusicLM!

GAI will only ever be as good as the prompt we can give it. Sad to say, but in this instance, we are the weakest link. Where MusicLM already shines is when it’s given a long, vague, drawn-out description: it still manages to turn out a number that takes on board the finer nuances you’ve asked for, even down to melodies, genres and solos.

You’ll notice when you start to use MusicLM that you can give trophies if it’s done a good job for you. This is critical to the learning process that it’ll have to go through to become even better. Although it comes baked in with the massive dataset already mentioned, real-life usage still can’t be beaten – at least we still do have a purpose then in this AI-driven future!

Although MusicLM has been given the tools to create and generate lyrics, at the moment they are poor. They sound like some form of English but are almost unusable – but you just know it’ll get better and quickly!

Where’s the money?

MusicLM won’t mimic an artist directly, but of course, you can ask it to generate a track in the style of…and that will become the grey area.

Part of the reason that Google won’t be making MusicLM generally available right now is its tendency to incorporate copyrighted material from the training data into the generated songs.

As it stands, Google’s research team established that about 1% of the music generated took melodies or riffs directly from the music it had been trained on. Google decided that even 1% was still too high for it to be released in its current state.

“We acknowledge the risk of potential misappropriation of creative content associated with the use case,” the co-authors of the paper wrote. “We strongly emphasise the need for more future work in tackling these risks associated to music generation.”

Oddly, this new dawn is close to home. My daughter works in the industry as the International Product Manager for Marketing at BMG. She represents musicians whose livelihoods rely on being fairly paid for their content – and I guess to some degree her future and livelihood depend on that as well.

In the future, if you can ask for a rock song to be generated in the style of (insert band’s name here), who would own the royalties? Does anyone get paid? How do creatives end up making a living?

There is already some history building in these murky areas. As long ago as 2020, Jay-Z’s label lodged copyright strikes against the YouTube channel Vocal Synthesis, claiming the channel had used AI to create Jay-Z covers of songs like Billy Joel’s We Didn’t Start the Fire. Although YouTube initially took down the videos, it has since reinstated them, saying the takedown requests were ‘incomplete’.

The Music Publishers Association’s point of view is that MusicLM and similar AI music generators violate music copyright by creating “tapestries of coherent audio from the works they ingest in training”.

The counter to that, as Andy Baio puts it, would be that music generated by an AI system would be “considered a derivative work, in which case only the original elements would be protected by copyright”.

I would guess that music publishing companies such as BMG, Universal and Sony will ask for the data on what songs have been used to train the GAI models. How the decision is made on what counts as fair remuneration for the work and content, I have no idea.

The official stance from Google is:

“We believe responsible innovation doesn’t happen in isolation. We’ve been working with musicians like Dan Deacon and hosting workshops to see how this technology can empower the creative process.”

Ready to use

The songs that MusicLM generates are available to download – as I have done with the sample tracks I have linked for you to listen to.

You’ll notice they are currently very short in duration – only 30 seconds or so – and low in audio quality as well (mono, 24 kHz and 16-bit files), but don’t forget, this is still in beta form…it will get better!
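To put those numbers in context, here is a rough back-of-the-envelope sketch in Python. It only works out how much raw audio data a clip of that format holds. The 30-second, mono, 24 kHz, 16-bit figures are the ones quoted above; the CD-quality comparison (stereo, 44.1 kHz, 16-bit) and the function name are my own additions for illustration, and real downloaded files will differ slightly because of file headers or compression.

```python
# Rough arithmetic only: size of uncompressed (PCM) audio for a given format.
# The MusicLM figures (30 s, mono, 24 kHz, 16-bit) come from the article;
# the CD-quality figures (stereo, 44.1 kHz, 16-bit) are added for comparison.

def pcm_size_bytes(seconds, sample_rate_hz, bit_depth, channels):
    """Return the size in bytes of raw PCM audio, ignoring any file header."""
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)

musiclm_clip = pcm_size_bytes(30, 24_000, 16, 1)
cd_clip = pcm_size_bytes(30, 44_100, 16, 2)

print(f"MusicLM-style clip: {musiclm_clip / 1_000_000:.2f} MB")
print(f"CD-quality clip:    {cd_clip / 1_000_000:.2f} MB")
```

In other words, a 30-second MusicLM-style clip carries about 1.4 MB of raw audio against roughly 5.3 MB for the same length at CD quality. A 24 kHz sample rate also means nothing above 12 kHz can be captured, which goes some way to explaining why the samples sound a little dull at the top end.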

There are a couple of videos that I came across in researching this article that I thought may interest you. First from Music Radio Creative and then Google’s video. And if you want to try your luck at signing up for the beta trial, follow this link.

Wrapping up

MusicLM is, or will be, another tool in the creator’s belt. How it is best policed, monitored and used, only time will tell.

Like it or not, generative artificial intelligence is happening, and trying to bury your head in the sand isn’t going to cut it. The age-old adage of better the devil you know comes to mind.

Not only will the music industry and musicians be affected by this advance, but so too will content creators of all kinds. Companies that create jingles, trailers and music beds, VO talent and eventually radio & podcast hosts will all be in the firing line.

For now, we still win out, with our warmth, humour and ability to react – but as I’ve kept on saying today…give it time.

Getting involved…

Guess what – if you look forward to my articles & blogs landing each day, you can help that happen! By clicking via this link, you can join Medium, and get my blogs every day, the moment I publish them. And, you can even get email notifications about them too.