AudioShake Unveils State-of-the-Art Vocal Isolation Model

May 8, 2025

Today, AudioShake announces its highest-quality vocal model to date, setting a new, state-of-the-art benchmark for vocal isolation.

While our efforts are always focused first and foremost on perceptual quality (that is, how our stems actually sound to real human ears), we also regularly test our models on quantitative benchmarks widely used across the industry. With a signal-to-distortion ratio (SDR) of 13.5 dB on MUSDB18-HQ, today’s vocal model release surpasses the state-of-the-art benchmark previously set by ByteDance in 2024, and builds on our success in industry challenges like the Sony Demixing Challenge.
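
For readers less familiar with the metric, SDR compares the energy of the true vocal stem with the energy of the error left in the separated stem, expressed in decibels; higher is better. The snippet below is a minimal sketch of the most basic form of that calculation. It is an illustration only: this announcement does not state which SDR variant or evaluation toolkit produced the 13.5 dB figure, and the function name `sdr_db`, the `eps` stabilizer, and the toy sine-wave signal are assumptions made for the example.

```python
import math


def sdr_db(reference, estimate, eps=1e-10):
    """Basic signal-to-distortion ratio in dB (hypothetical helper).

    reference: samples of the ground-truth vocal stem
    estimate:  samples of the separated vocal stem, same length
    eps:       small constant to avoid division by zero (an assumption)
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the true signal.
    signal_energy = sum(r * r for r in reference)
    # Energy of whatever differs between the truth and the estimate.
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy example: a one-second 440 Hz tone, and an "estimate" that is 10% too quiet.
reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
estimate = [0.9 * r for r in reference]
print(round(sdr_db(reference, estimate), 2))  # about 20 dB
```

Under this basic definition, every 10 dB corresponds to a tenfold difference in energy, so 13.5 dB would mean the target vocal carries roughly 22 times the energy of the residual error. Published benchmark scores are usually aggregated over many tracks and may use a more elaborate variant of the metric, so the sketch should be read as intuition rather than as the evaluation procedure behind the reported number.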

AudioShake’s new vocal model delivers cleaner, more natural separations by capturing subtle details like long-tail reverb, preserving the original vocal’s depth and timbre, and more proficiently picking up vocal harmonies. It surpasses previous models in both perceptual listening tests and quantitative benchmarks. 

Already made available to some of our beta testers, AudioShake’s new vocal model has been hailed for its precision and quality. 

“The new vocal model from AudioShake is a big step up. The separation is cleaner, and there’s a noticeable boost in clarity without losing the feel of the original mix. Reverb tails are better preserved, and the stereo image stays intact, which helps the vocal sit more naturally in the track. The result is a spacious, defined sound—easily the best I’ve heard.” – Daniel Rowland, Audio Engineer and Co-Founder of Immersive Mixers

Hear for yourself how AudioShake’s vocal model has improved over the years.