CMU Trends

Trends: AI – The next revolution in music?

Published on Monday 19 February 2018

AI White Paper

Music business conference MIDEM this week publishes a brand new white paper from our consultancy unit CMU Insights reviewing the potential impact various AI technologies will have on the music industry in the next decade. As a preview, CMU Trends presents some highlights providing a beginner’s guide to the three technologies covered: audio recognition, automated messaging and music composition AI.

The history of the music industry is basically a story about how a sequence of new technologies respectively transformed the way music is made, performed, recorded, distributed and consumed.

Each new chapter begins as a new technology takes hold and kickstarts a revolution. Though each time that happens, we know that another equally revolutionary technology isn’t far away.

Each innovation results in a new chapter. And those new chapters seem to come along with increased frequency as the years go by.

But not every technological innovation results in a revolution. One of the challenges for the music industry is working out which new technologies will take hold and are therefore worth investing in. Because that investment can be considerable when you take into account the time, the money and the creative energy employed. So what technologies will be causing a revolution in the music business over the next decade?

Of particular note are those technologies that can be loosely grouped under the banner ‘artificial intelligence’. Tools that employ big data, algorithms and machine learning to change the way music is monitored, marketed and made.

Given that the conversation around AI is becoming ever more vocal – as academics, journalists, politicians and entrepreneurs consider how these new technologies could change the way our society works, the way our economy operates, and the very essence of what it means to be human – it is surprisingly hard to define what ‘artificial intelligence’ actually means.

The CMU Insights white paper for MIDEM considers three technologies of interest to the music community that can be loosely placed under the umbrella of AI.

Depending on your definition of ‘artificial intelligence’, these technologies are either already examples of AI in action, or they are prototypes that will ultimately embrace machine learning to become ever more sophisticated, as AI at large becomes more efficient and more affordable.

Those three technologies are:

• Audio-recognition tools that recognise music being played online, on air, or in the live environment.

• Automated messaging platforms that allow artists to communicate and interact with fans through messaging apps.

• Automated creation tools that can automatically create music or video according to prescribed criteria or by scanning other material.

Audio-recognition technology isn’t particularly new. Perhaps the most famous brand in this space, Shazam, launched nearly two decades ago in 1999. However, technology of this kind is becoming ever more sophisticated, and is being employed in an increasing number of ways. As such, it feels like the full potential of audio-recognition is yet to be realised.

Platforms offering audio-recognition are usually based around what are referred to as ‘digital acoustic fingerprints’, or some variation of that term. The platform creates a ‘condensed digital summary’ of each piece of audio it is exposed to. That ‘condensed digital summary’ is unique to that track, hence ‘fingerprint’. Metadata is then attached to each fingerprint to identify the audio and provide other key information about it.

Once a database of fingerprints has been built, when the audio-recognition platform is re-exposed to a piece of audio it should be able to identify which fingerprint the track is associated with. It can then deliver the accompanying metadata. In the case of a consumer-facing app like Shazam, this means telling the user the name of the track they are currently listening to.
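The register-then-identify process described above can be sketched in a few lines of Python. This is a toy illustration only: real platforms fingerprint acoustic features such as spectral peaks rather than raw data, precisely so they can match noisy, partial samples. All names and the example metadata here are hypothetical.

```python
import hashlib

# Toy fingerprint: a 'condensed digital summary' of the audio.
# Real systems derive this from acoustic features, not raw bytes.
def fingerprint(audio_bytes: bytes) -> str:
    return hashlib.sha256(audio_bytes).hexdigest()[:16]

# The fingerprint database: fingerprint -> attached metadata.
database = {}

def register(audio_bytes: bytes, title: str, artist: str) -> None:
    """Expose the platform to audio and attach metadata to its fingerprint."""
    database[fingerprint(audio_bytes)] = {"title": title, "artist": artist}

def identify(audio_bytes: bytes):
    """Re-expose the platform to audio; return the metadata, if known."""
    return database.get(fingerprint(audio_bytes))

register(b"fake-audio-track-1", "Song A", "Artist A")
print(identify(b"fake-audio-track-1"))  # {'title': 'Song A', 'artist': 'Artist A'}
print(identify(b"unknown-audio"))       # None
```

Note that because an exact hash only matches identical input, this sketch cannot handle the degraded or altered audio discussed next, which is exactly where the real engineering effort in this domain goes.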

From a technical perspective, advances in the audio-recognition domain include the ability to identify a track more quickly, which is to say from a smaller sample of the recording, and to ID a track regardless of sound quality and background noise, or where the track has been slightly altered in some way.

Then there is the separate challenge of recognising songs rather than specific tracks, so that a platform can identify new and live versions of songs as well as officially released recordings.

From a commercial perspective, although consumer-facing products like Shazam are best known, the B2B employment of this technology is arguably more interesting, and is where there is still much room for growth.

This is primarily about using audio-recognition technology to identify what music is being played in a variety of circumstances where traditionally it was hard to accurately monitor usage. This information is valuable both as marketing data and for ensuring the accurate distribution of royalties.

The latter employment of audio-recognition technology is most important for user-upload platforms and collective management organisations. Both have a need to more accurately monitor what music is being used by a large number of unsophisticated users every minute of the day across the globe.

In the case of user-upload platforms, the technology is required so that they can give rights owners some control over their content, ensuring they meet their obligations under copyright law and/or their licensing agreements. For collective management organisations, so that they can more accurately distribute royalties to their members when music is played or performed in public.

Some of these organisations have built their own proprietary audio-recognition platforms, like YouTube’s Content ID and Facebook’s Rights Manager, while others rely on the technology of service provider businesses.

In the white paper we talk to three innovators in this domain: WARM, DJ Monitor and Dubset. 

Despite the rise of social media as key marketing and communication channels for artists in the last fifteen years, music marketers have consistently stressed the importance of gathering the email addresses of fans.

Although an artist will communicate more frequently through social media, email is more direct, can be more easily personalised and commercialised, and provides particularly powerful fan data.

However, over the years consumers – and especially younger consumers – have become less engaged with email. Younger fans are increasingly likely to see email as a work-based channel of communication, and not a place to connect with artists and brands.

That latter kind of conversation has shifted over to messaging apps like Facebook Messenger and WhatsApp. In recent years, we have also started to see conversations that would previously have occurred on social networks shifting over to messaging platforms.

All of which creates a need for artists to start engaging with their fans through the messaging apps.

However, that creates a challenge because these apps have been primarily created for direct peer-to-peer conversations between friends. Realistically artists, and especially established artists, don’t have the time or the inclination to engage in conversations of this kind. That doesn’t mean they can’t engage fans through these platforms.

Hence we are starting to see the emergence of new technologies that employ data and algorithms to allow artists and brands to connect with fans and consumers via messaging apps.

Sometimes referred to as chatbots, these technologies facilitate automated direct connections with fans, though quite what that means can vary. It may differ according to what the technology is capable of, or to what the artist or brand feels is appropriate.
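At its simplest, a chatbot of this kind maps incoming fan messages to pre-written replies. Here is a minimal sketch in Python; the keyword rules and reply texts are entirely hypothetical examples of what an artist's team might configure, and real platforms layer machine learning on top of this kind of logic.

```python
# Hypothetical rules an artist's team might set up:
# each rule is (trigger keywords, canned reply).
RULES = [
    (("tour", "gig", "tickets"), "Tour dates and tickets are on our tour page."),
    (("merch", "shirt", "vinyl"), "Check the merch store for the latest items."),
    (("album", "release", "single"), "The new album is out now on all platforms."),
]
FALLBACK = "Thanks for your message! The team will get back to you soon."

def auto_reply(message: str) -> str:
    """Return the first rule's reply whose keywords appear in the message."""
    words = set(message.lower().split())
    for keywords, reply in RULES:
        if words & set(keywords):
            return reply
    return FALLBACK

print(auto_reply("when do tickets go on sale?"))
print(auto_reply("hello there"))
```

The design point this illustrates: the artist defines the rules once, and the automation handles each individual conversation, which is what makes fan engagement at scale feasible.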

In the white paper we talk to one of the innovators in this domain: POP.

Beyond monitoring and marketing music, what about making it? The third area of AI technologies of interest are automated creation tools: platforms that create original content according to pre-set criteria.

From a music perspective, the technologies to watch are the music composition platforms, which can create original pieces of music quickly and cheaply.
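To give a flavour of composition to pre-set criteria, here is a deliberately crude Python sketch that generates a melody given a key (expressed as a scale) and a target length, favouring stepwise motion. The rules here are invented for illustration; the commercial platforms discussed in the white paper use far more sophisticated, often machine-learned, models.

```python
import random

# Pre-set criteria: the scale (key) and the number of notes wanted.
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]

def compose(scale, length=8, seed=None):
    """Generate a melody from the scale, mostly moving by small steps."""
    rng = random.Random(seed)  # seed makes the output reproducible
    melody = [rng.choice(scale)]
    for _ in range(length - 1):
        i = scale.index(melody[-1])
        step = rng.choice([-2, -1, 0, 1, 2])  # small interval from last note
        melody.append(scale[(i + step) % len(scale)])
    return melody

print(compose(C_MAJOR, length=8, seed=42))
```

Even this trivial generator is quick and cheap per piece, which hints at why such tools first found a market in soundtracks for other media rather than in mainstream songwriting.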

This is an area that is advancing rapidly, though to date these technologies are mainly being used to create original soundtracks for other media, such as videos, podcasts and games. As a result, the machines are now competing with humans in the production music domain.

However, the big question is, as these technologies become ever more sophisticated, could they also start to write music to be commercially released, meaning that the machines start to compete with humans in more mainstream songwriting? Opinion is divided as to quite how far AI-created music could go.

In the white paper we talk to two of the innovators in this domain: Jukedeck and Rotor. 

You can download the white paper produced by CMU Insights for MIDEM from its website – info at
