The Year Audio Became AI's Next Frontier

A year ago, AI began to dominate headlines with promise. In 2025, it delivered, transforming healthcare, entertainment, logistics, education, and nearly every industry it touched. For audio, the transformation was even more dramatic. What started as a niche corner of AI suddenly became its next frontier, with Meta, OpenAI, and Google racing to perfect voice technology. And AudioShake's audio separation technology found itself at the center of that shift.
The validation came from every direction: AWS CEO Matt Garman featured us in his re:Invent keynote. Meta's SAM Audio leaderboard ranked our models at the top of our field. We set new state-of-the-art benchmarks. Our technology was used in everything from Oscar-nominated films to voice cloning for ALS patients to creative apps running on the edge.
Here's how 2025 became the year audio separation moved from technical capability to business imperative.
When Hollywood Needs Your Technology
The entertainment industry doesn't adopt new technology lightly. But when director Pablo Larraín needed to authentically capture Maria Callas's voice for his Oscar-nominated biopic Maria, he turned to AudioShake. With our technology, his team isolated Callas's vocals from 1960s orchestral recordings, allowing Larraín to layer her authentic voice over Angelina Jolie's performance, a creative solution to a previously impossible challenge.
Decca Records faced a similar challenge: how to create a Dolby Atmos version of Andrea Bocelli and Luciano Pavarotti's 1995 duet, and enable Bocelli to perform live alongside the late Pavarotti's isolated vocals. Our separation technology made both possible.
YouTube creator Mark Rober brought us a different problem: 150 million subscribers watching CrunchLabs science videos that couldn't be localized. His team used AudioShake to separate the dialogue, music, and effects from fully mixed videos, allowing them to swap music, create new edits, and redub content in new languages.
These aren't one-off creative experiments. They reflect a fundamental shift: audio separation has become essential infrastructure for modern content production.
Enabling Voice Cloning for ALS Patients
The most profound validation came from Bridging Voice, our partnership helping ALS patients preserve their ability to communicate. We separated voices from noisy home videos, enabling patients like Mike G. to create voice clones before the disease silenced them permanently.
Building the Models That Beat the Benchmarks
The technical milestones followed:
- State-of-the-art vocal isolation: 13.5 dB Signal-to-Distortion Ratio on MUSDB18-HQ, surpassing ByteDance's previous benchmark and setting a new industry standard
- World-first multi-speaker separation: High-fidelity continuous speaker separation that detects, diarizes, and separates overlapping voices
- State-of-the-art lyric transcription and alignment: Added new language support and our fastest-performing models yet
- Real-time capabilities: Production-grade, real-time audio separation released for iOS, macOS, Android, Windows, and Linux
- Music Detection & Removal for live media: Launched tools that enable broadcasters and sports leagues to remove copyrighted music from streams and archives
From Startup Competition to Industry Standard
But the real validation came from the industry itself. A year after we won AWS's Startup Competition, AWS CEO Matt Garman featured AudioShake in his keynote, highlighting how audio separation solves critical problems in media production, AI training, and enterprise communications. Meta's SAM Audio report ranked our models at the top of its leaderboards for deterministic source separation models. And in October, we closed our $14M Series A led by Shine Capital, with participation from Thomson Reuters Ventures and Origin Ventures.
What This Means for 2026
Every major AI player is now betting on audio—not as a feature, but as a fundamental capability. The question isn't whether audio will be central to the next wave of AI. It's who will build the infrastructure that makes it possible.
For AudioShake, 2025 was the year we proved that audio separation isn't just technically impressive—it's business-critical. From Oscar-nominated films to ALS patients to YouTube creators to enterprise broadcasters, the use cases kept expanding. We’re excited for 2026!