CMU Trends Digital

Trends: Reviewing the potential of the new creative tech – the story so far

By | Published on Wednesday 30 November 2016


The story of the music industry – and certainly the recorded music industry – is one of regular technological leaps that change the music-making process, the consumer experience and the music rights business, meaning just as everyone gets used to one set-up, something comes along that changes everything.

Most people would agree that those regular leaps are now occurring much more frequently, the rise of the web and other net-connected devices instigating and enabling a plethora of technological innovations which even the earliest of adopters – whether creators, consumers or business leaders – have struggled to keep up with.

It also means that the music industry is constantly having to experiment with new platforms, and then build businesses around those that gain traction, while at the same time still supporting legacy services and formats that – while probably in terminal decline – still command a sizeable market share.

This is all rather challenging, to say the least. Though, if a challenge is an opportunity, then there is an upside to all of this. And if you choose to adopt that optimistic viewpoint, a bunch of rapidly evolving new technologies mean there are plenty of opportunities ahead.

As Music 4.5 puts the spotlight on ‘The New Creative Tech’, let’s review the technologies about to have a significant impact on the music industry.

New technologies are often accompanied by complicated terminology, with jargon that actually covers a wide variety of quite distinct products. But we’ll put the technologies of interest into two loose groups – artificial intelligence and enhanced consumer experience – while acknowledging that the two sets inevitably overlap.

‘AI’ isn’t a new term, of course, and it’s a field of computer science that has been evolving for a long time. Though quite what is meant by ‘artificial intelligence’, and therefore what specific technologies fall under this banner, is debatable.

As Dr Mick Grierson, Director of Creative Computing at Goldsmiths College says: “‘Artificial intelligence’ is perhaps not a very useful term, mainly because it’s not easy to define what anybody might mean by ‘intelligence’. There are many definitions, but it’s fairly easy to argue that nobody is quite sure what intelligence is. This is further complicated by the idea that artificial intelligence is somehow related to human-like behaviour or consciousness. This is when it gets really messy, because people are even less sure what consciousness is!”

“And even if they did know, the benefit of machines having human-like consciousness and behaviour is probably not as useful as you might think”, he continues. “For example, if we assume that an ‘artificial intelligence’ is something that can think and solve problems the way a human does, you can quite easily demonstrate this might not be very desirable. Humans often make bad decisions, we’re pretty easy to fool, and we make mistakes all the time, often because we always think we’re right”.

In terms of what people most commonly mean by AI today, he explains: “There are things that many living things do – often that they are not aware of – that are useful for solving problems, like unconsciously remembering patterns from everyday experience. By making models of how we think this might work, some people have created algorithms that learn to solve similar problems – such as to spot certain objects or sounds in a variety of complex situations”.

“A better term for the ability of machines to solve problems by learning from data is ‘machine learning’”, he concludes. “It’s a far more accurate term for understanding what’s happening right now. The algorithms that we create through machine learning are ‘problem solvers’ for specific questions that we are attempting to teach them about”.
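Grierson’s description of machine learning – an algorithm that becomes a “problem solver” by learning from example data – can be sketched in a few lines of Python. This is purely illustrative: a toy nearest-centroid classifier with invented two-number “sound features”, not anything a real music system would use as-is.

```python
# A toy "problem solver" in the sense Grierson describes: it learns from
# labelled examples, then answers a specific question about new input.
# The feature vectors and labels here are entirely hypothetical.

def train_centroids(examples):
    """Learn one average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(centroids, features):
    """Pick the label whose learned centroid is closest to the input."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Hypothetical training data: made-up [loudness, brightness] features.
training = [
    ([0.9, 0.2], "drum"), ([0.8, 0.3], "drum"),
    ([0.2, 0.9], "vocal"), ([0.3, 0.8], "vocal"),
]
model = train_centroids(training)
print(classify(model, [0.85, 0.25]))  # prints "drum"
```

The point of the sketch is that nothing here was hand-coded to recognise drums: the behaviour comes entirely from the training examples, which is what distinguishes machine learning from a fixed rule book.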

From a music perspective, there are a number of practical uses of AI, the most obvious being: audio-identification technologies, of which Shazam and YouTube’s Content ID are perhaps best known; so called chatbots which are starting to be used to automate fan communication; and machines actually creating music.

Martin Gould’s Sonalytic is working on the first of those. “Everyone remembers the ‘wow’ moment of when they first saw Shazam”, he remarks. “Even today, I still think that this is an amazing technology for people to have at their fingertips. But fingerprinting technologies have been around for a very long time now, and – honestly – one of the things that attracted us to working on this problem was the feeling that there hasn’t been much progress in recent years. Given that advances in machine learning have enabled computers to create detailed labels for photos, and even to drive cars, why hasn’t there been similar progress in the field of audio-identification?”

So how is Sonalytic innovating? “Sonalytic analyses music at the microscopic level of its constituent parts, to extract detailed fingerprints of the individual sounds. This means that we can identify not only whole songs, but also stems, loops and samples within them. For example, we can identify when a derivative musical work uses the vocal line from another song – or even a very short snippet from it. We are also strongly robust to a wide range of audio obfuscations, such as changes in pitch and tempo, filtering, EQing, distortion, and many more”.

He goes on: “We are able to do all of this thanks to our cutting edge machine learning engine, which sits at the heart of our technology. By feeding this engine more and more training data, we are able to extract increasingly powerful fingerprints. By contrast, most existing audio fingerprinting techniques follow a fixed fingerprint-extraction recipe, and therefore don’t provide the power and flexibility of our approach”.
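To see what Gould means by a “fixed fingerprint-extraction recipe”, here is a sketch of the classic landmark approach that underpins Shazam-style matching: hash pairs of spectral peaks so that a short, noisy excerpt can still match an indexed track. The peak lists here are hypothetical (time frame, frequency bin) tuples – real systems extract them from a spectrogram, a step omitted here – and this is the conventional technique, not Sonalytic’s learned one.

```python
# Classic "fixed recipe" audio fingerprinting sketch: pair each spectral
# peak with the next few peaks and hash (freq1, freq2, time gap). The
# hash ignores absolute time, so an excerpt from anywhere in a track
# still matches. Peaks here are hypothetical (time_frame, freq_bin) pairs.

def landmark_hashes(peaks, fan_out=3):
    """Pair each peak with the next few peaks; hash (f1, f2, time gap)."""
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            hashes.append(((f1, f2, t2 - t1), t1))  # hash plus anchor time
    return hashes

def match_score(db_hashes, query_hashes):
    """Count query hashes that also occur in the indexed track."""
    index = {h for h, _ in db_hashes}
    return sum(1 for h, _ in query_hashes if h in index)

# Hypothetical track, and a later excerpt of it shifted in time.
track_peaks = [(0, 100), (1, 150), (2, 90), (3, 160), (4, 110)]
query_peaks = [(t + 5, f) for t, f in track_peaks[:3]]
db = landmark_hashes(track_peaks)
print(match_score(db, landmark_hashes(query_peaks)))  # prints 3
```

Because the hashing recipe is fixed, a pitch-shifted or tempo-stretched recording changes the peaks and breaks the hashes – exactly the brittleness Sonalytic says its trained, learned fingerprints are designed to overcome.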

New developments in audio-recognition technology will likely make consumer-facing Shazam-type set-ups more and more sophisticated. Though, for the music industry, it is possibly B2B implementations that are most important, which is to say Content ID type systems that get ever better at spotting audio as it is uploaded to the net, partly for anti-piracy work, but even more so to enable better micro-licensing.

Gould says: “I’m very confident that the recent explosion in popularity of user-generated content isn’t going to slow anytime soon, and it goes without saying that audio-identification technologies will remain at the heart of enabling fair, effective and efficient monetisation, especially given the staggering volumes of uploads involved”.

Elsewhere, Paul Crick from IBM reckons that the chatbots slowly being employed by artists and music companies are the start of an enhanced direct-to-fan experience. Noting that the technology businesses that thrive are those that solve a real problem, he says: “Looking through a music lens, the opportunity exists to overcome some of the structural problems that – for example – get in the way of creating a joined up experience for fans, in the live space say, where providers offer chatbots to engage with the artist which integrate with ticketing, travel services and merchandising, so to offer a door-to-door experience at a range of different price points”.

He continues: “I think the music industry has an opportunity now to access the money traditionally left on the table by being even more creative in thinking about the fan experience, by stepping into the shoes of the fan, understanding their needs and then delivering a range of integrated or ‘joined up’ offers at different price points”.

What about AI turning computers into composers though? Back to Grierson: “I definitely think that machine learning will provide more people with greater power with respect to creating music. And this will mean that some forms of music composition could be replaced, though, to be honest, this has been the case for years, and you can easily argue that the role of the composer, in many areas of music, is not as important as you might think, and perhaps never was – take manufactured pop, for example”.

However, Grierson reckons that these new technologies will generally become tools for human creators, rather than replacing them. “Machines only do what people tell them”, he says. “Nothing more. Without people to control and drive the process, you end up with nothing”.

“This is where the real benefit will be”, he says. “Machine learning is going to revolutionise what is possible for composers. Once contemporary machine learning is embedded in tools such as non-linear editors and music production systems, music makers are going to suddenly realise that the palette of musical possibility just became so much more awesome”.

“The ability to create new sounds – sounds that no one has heard before – will be much more accessible with machine learning”, he concludes. “This is what we’ve been working on at Goldsmiths, and I think the future is bright for machine learning and the arts”.

Crick agrees, citing a recent collaboration between producer Alex Da Kid and IBM’s AI system Watson. He says: “With Alex Da Kid’s single ‘Not Easy’, humans were involved in training IBM Watson in the rules of music, including harmony, chord progressions and rhythm. Watson had also been trained to identify and understand key words in the English language and whether certain words and phrases carried a positive sentiment. Consequently, Alex Da Kid was able to work with Watson to identify how people express themselves about the subject of heartbreak and break up in relationships”.

Of course, once machines start to get involved in creating original music, that poses some interesting copyright questions, as to who owns the intellectual property in any music created – the creator being the default owner of a song copyright under current laws.

Crick agrees: “I think you have to start with understanding that, right now, US and UK copyright law only acknowledges human creators rather than machines as creators. Meanwhile the law – certainly US law – has failed to define the intangible concepts of creativity, originality and authorship. Which means there are many unknowns with AI that have still to be resolved”.

Though, for the time being, where humans are mainly utilising AI tools in their own composition process, those tools are likely to be treated as instruments, with conventional copyright ownership rules still applied.

By ‘enhanced consumer experience’, we are thinking of innovations in audio and video technology that will revolutionise what it means to experience recorded media. So, what are often termed ‘virtual reality’ and ‘3D audio’.

Let’s deal with sounds first. The BBC’s Iain Tweedale explains: “With standard stereo headphones the sound feels like it’s coming from inside your head. Binaural sound is recorded differently and takes into account how one ear hears a sound slightly differently to the other ear depending on where the sound is coming from. So with binaural it sounds like you’re hearing the sounds from outside your head, like we do in the real 360 degree world, adding depth and height to a music performance”.

“Also, regular stereo sound assumes the listener is static, hearing the sound in one fixed place”, he adds. “Where it gets really cool is with dynamic binaural where you can move your head round to look for the source of the sound. This opens up amazing new opportunities in TV and gaming”.
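The interaural cues Tweedale describes can be roughly sketched in code: a sound off to one side reaches the nearer ear slightly earlier and slightly louder. Real binaural rendering uses measured head-related transfer functions (HRTFs); this toy model only approximates the time delay, via Woodworth’s classic formula, plus a crude, hypothetical level-difference panning law.

```python
# Toy model of the two main binaural cues: interaural time difference
# (ITD) and interaural level difference (ILD). Not a real HRTF renderer.
import math

SPEED_OF_SOUND = 343.0  # metres per second, in air
HEAD_RADIUS = 0.0875    # metres, a commonly used average

def interaural_cues(azimuth_degrees):
    """Return (itd_seconds, left_gain, right_gain) for a source angle.
    0 degrees = straight ahead, +90 = hard right. ITD uses Woodworth's
    approximation; the gain split is a crude hypothetical panning law."""
    a = math.radians(azimuth_degrees)
    itd = (HEAD_RADIUS / SPEED_OF_SOUND) * (a + math.sin(a))  # >0: right ear leads
    right_gain = 0.5 * (1 + math.sin(a))  # louder in the nearer ear
    left_gain = 1.0 - right_gain
    return itd, left_gain, right_gain

def effective_azimuth(source_azimuth, head_yaw):
    """Dynamic binaural in one line: turning the head shifts where the
    source appears, so the cues must be recomputed as the listener moves."""
    return source_azimuth - head_yaw
```

A source at 90 degrees right yields an ITD of roughly 0.65 milliseconds – small, but enough for the brain to localise the sound – and the `effective_azimuth` helper captures why dynamic binaural needs head tracking: turn to face the source and the time difference collapses to zero.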

For its part, the BBC is already dabbling with those new opportunities. “We’re already using them with classical music performance such as this year’s Proms, with TV and radio drama such as Russell T Davies’s ‘Midsummer Night’s Dream’, and currently we’re using it with the ‘Planet Earth 2’ series where viewers can immerse themselves in the amazing sounds of locations like the rain forest”.

Of course, the wider entertainment industries have offered enhanced audio experiences – beyond basic stereo – for some time now. Though, in music, those experiences were usually premium products sold to a niche audience. But will the new 3D audio technologies become a more mainstream thing?

Tweedale: “We’re still at the stage of moving it from the R&D sphere into the regular production process. This requires some industry standards for recording and playback and cost effective capture methods. Once this has been done – which is pretty close now – we can see it becoming a regular feature in big national event radio and TV. It’s also good to see YouTube and Facebook beginning to offer 3D and dynamic binaural capability, as we need regular browser capability to make it widely available. I can also see this going mainstream in VR gaming very soon”.

Which brings us to VR, the creative technology that has probably garnered the most column inches this year. We’ve been promised ‘virtual reality’ experiences before, of course, only to be disappointed. But this time the new tech definitely impresses.

“I think VR is a technology that is here to stay and it offers up tremendous opportunities in media and entertainment”, says Crick. “For music, it offers a new way to engage with fans”.

The co-founder of a company working with VR in music, MelodyVR’s Anthony Matchett, unsurprisingly agrees. “We believe that VR will change the way that artists connect with their fans forever, delivering previously unobtainable experiences with engagement that far outweighs any traditional forms of media”.

The most obvious use of VR in the music space is bringing the live show into people’s homes. Ever since the early days of the web there have been attempts to launch live music streaming platforms. Most failed – Boiler Room perhaps the most notable exception – sometimes because of licensing challenges, other times because the technology just wasn’t ready.

Both record companies and concert promoters see opportunities to generate new revenue streams through VR gig streaming. Not that it will ever quite compete with the real thing – “we don’t believe that events streamed in VR will ever replace the physical experience”, insists Matchett, but, he adds, “VR live streaming enables fans with geographical, financial, age-related or physical constraints to enjoy the ‘live experience’ which is, of course, incredibly exciting for artists, promoters and fans alike”.

But gig streaming is just one possible manifestation of VR in the music space, and may well not be the one that captures the imagination of consumers. We can expect a number of fascinating experiments in this space in the coming years, especially once so called ‘augmented reality’ – which integrates real and virtual experiences – comes of age.

Certainly most people operating in this space reckon that there will be mainstream products to come out of all the VR experimentation that will appeal to mainstream consumers, and therefore artists of all kinds. Indeed, that is already starting to happen, according to MelodyVR. “To date, we’ve worked with just under 500 mainstream artists, which demonstrates the appetite for VR across the industry”, says Matchett. “We think VR is relevant to all artists and all labels, regardless of audience”.

In addition to watching the creative experiments in the VR domain, all eyes will also be on which VR technologies gain traction, with most of the big web and tech giants trying to grab a slice of this market. “I think we’ll see some market consolidation over the next couple of years”, says Matchett. And that, crucially for mainstream adoption of VR content, will result in “a vast reduction in the price of hardware”.

It’s impossible to predict exactly which VR manifestations will last the distance, though Crick reckons that success will probably come to those who successfully integrate both VR and AI. “As VR headsets become ubiquitous, then the key will be to use the ‘hyper-personalisation’ that’s now possible through AI to help people receive pushed content and find content that matches their own individual behaviour, moods and preferences”.

The one thing we probably can predict, though, is anyone hoping to pause for thought following the big CD-to-download-to-stream transition is out of luck: the music industry’s probably already due another tech-led revolution.

For more information about Music 4.5 click here