AudioSourceRE and Audionamixs Xtrax Stems are among the first consumer-facing software options for automated demixing. Feed a song into Xtrax, for example, and the software spits out tracks for vocals, bass, drums, and other, that last term doing heavy lifting for the range of sounds heard in most music. Eventually, perhaps, a one-size-fits-all application will truly and instantly demix a recording in full; until then, its one track at a time, and its turning into an art form of its own.
What the Ear Can Hear
At Abbey Road, James Clarke began to chip away at his demixing project in earnest around 2010. In his research, he came across a paper written in the 70s on a technique used to break video signals into component images, such as faces and backgrounds. The paper reminded him of his time as a masters student in physics, working with spectrograms that show the changing frequencies of a signal over time.
Spectrograms could visualize signals, but the technique described in the papercalled non-negative matrix factorizationwas a way of processing the information. If this new technique worked for video signals, it could work for audio signals too, Clarke thought. I started looking at how instruments made up a spectrogram, he says. I could start to recognize, Thats what a drum looks like, that looks like a vocal, that looks like a bass guitar. About a year later, he produced a piece of software that could do a convincing job of breaking apart audio by its frequencies. His first big breakthrough can be heard on the 2016 remaster of the Beatles Live at the Hollywood Bowl, the bands sole official live album. The original LP, released in 1977, is hard to listen to because of the high-pitched shrieks of the crowd.
After unsuccessfully trying to reduce the noise of the crowd, Clarke finally had a serendipity moment. Rather than treating the howling fans as noise in the signal that needed to be scrubbed out, he decided to model the fans as another instrument in the mix. By identifying the crowd as its own individual voice, Clarke was able to tame the Beatlemaniacs, isolating them and moving them to the background. That, then, moved the four musicians to the sonic foreground.
Clarke became a go-to industry expert on upmixing. He helped rescue the 38-CD Grammy-nominated WoodstockBack to the Garden: The Definitive 50th Anniversary Archive, which aimed to assemble every single performance from the 1969 mega-festival. (Disclosure: I contributed liner notes to the set.) At one point during some of the festivals heaviest rain, sitar virtuoso Ravi Shankar took to the stage. The biggest problem with the recording of the performance wasnt the rain, however, but that Shankars then-producer absconded with the multitrack tapes. After listening to them back in the studio, Shankar deemed them unusable and released a faked-in-the-studio At the Woodstock Festival LP instead, with not a note from Woodstock itself. The original festival multitracks disappeared long ago, leaving future reissue producers nothing but a damaged-sounding mono recording off the concert soundboard.
Using only this monaural recording, Clarke was able to separate the sitar masters instrument from the rain, the sonic crud, and the tabla player sitting a few feet away. The result was both completely authentic and accurate, with bits of ambiance still in the mix, says the box sets coproducer, Andy Zax.
The possibilities upmixing gives us to reclaim the unreclaimable are really exciting, Zax says. Some might see the technique as akin to colorizing classic black-and-white movies. Theres always that tension. You want to be reconstructive, and you dont really want to impose your will on it. So that’s the challenge.
Heading for the Deep End
Around the time Clarke finished working on the Beatles Hollywood Bowl project, he and other researchers were coming up against a wall. Their techniques could handle fairly simple patterns, but they couldnt keep up with instruments with lots of vibratothe subtle changes in pitch that characterize some instruments and the human voice. The engineers realized they needed a new approach. Thats what led toward deep learning, says Derry Fitzgerald, the founder and chief technology officer of AudioSourceRE, a music software company.