AudioShake expands into dialogue, music, and effects separation for production studios and dubbing services

April 11, 2024

AudioShake, the leading stem separation company, today announced the debut of its dialogue, music, and effects separation technology. The new service leverages AudioShake’s award-winning AI to cleanly separate dialogue, music, and effects from a video or audio file in a matter of seconds, improving accuracy and reducing the cost and time of producing dubbed or captioned content.

Before AudioShake, the broadcast and production industries depended on automated speech recognition (ASR), artificial intelligence, and machine learning technologies to open up localization services. Unfortunately, these have fallen short in creating clean dialogue tracks when recordings have poor quality, low fidelity, or considerable background noise – as is often the case with older content.

Until now, the process of correcting for interference in dialogue tracks has proved costly and time-consuming. Trained to differentiate between vocals, music, and “other” noises, AudioShake’s application for dubbing addresses a decades-long challenge for the industry.

With AudioShake, users can create clean dialogue tracks before turning to other ASR, dubbing, or captioning technologies and gain more accurate captions. And unlike traditional services, AudioShake is capable of retaining the music and effects tracks, which can be pivotal to delivering a high-quality final product that is true to the original content. 

“Increasingly, content localization is becoming a critical aspect for content creators, corporations, and studios to grow their business. When we saw the difference AudioShake stems were making for our early dubbing and localization partners, we knew it was important for us to bring it to all of our users,” said Jessica Powell, Co-Founder and CEO of AudioShake. “It’s been amazing to find new ways for stems to be of use to different industries, and see the applications of our technology beyond music.”

Though this is the first time the company has deployed this technology on its platform, AudioShake’s dialogue, music, and effects separation model has already been adopted by film and TV studios, and integrated with dubbing and captioning services including cielo24, Dubverse, OOONA, Pandastorm Pictures, Yella Umbrella, and more. Partners have seen transcription accuracy improve by more than 20%.

"Video publishers understand the value of expanding their reach to new markets, but they often face obstacles in cost and complexity when it comes to localization," said Shanna Johnson, CEO of "Our joint offering with AudioShake is an innovative solution that enables publishers to accurately, easily, and cost-effectively localize their video content at scale to engage new audiences worldwide."

Those interested in using AudioShake’s dialogue, music, and effects separation model or other stem separation capabilities can join at or email for more details.

About AudioShake

AudioShake makes audio interactive, customizable, and accessible by using AI to separate sound recordings into their component parts and stems. Winner of Sony’s Demixing Challenge, AudioShake’s best-in-class technology is used by major and indie labels, music publishers, film studios, dubbing companies, and apps for uses such as localization, captioning, immersive audio, sync licensing, fan engagement, voice synthesis, gaming, and more.