FAQ

How can I get my audio stemmed?

For music, we have an on-demand platform designed specifically for industry professionals called AudioShake Live, where you can quickly upload your songs and create stems for them. Get in touch for a demo and free trial. For independent artists and labels, we have AudioShake Indie.

We've also integrated across a number of platforms in the sync and localization industries. Music supervisors can make use of our services on Chordal. Dubbing freelancers and studios can find our technology already embedded in their workflow tools including OOONA and Yella Umbrella, as well as through services including Dubverse and cielo24.

If you are a developer, check out our documentation center for ways to integrate our API.

How does AudioShake's technology work?

AudioShake uses AI to recognize the different components in a piece of audio, for example the drums in a rock song. We then isolate that component as a stem so you can use it for new purposes: sampling, sync licensing, remastering, remixes, and more.

I've seen similar tech in the past and it always falls short. What makes AudioShake different?

There have long been efforts to isolate tracks within music, but recent advances in artificial intelligence have enabled a big leap in quality. We believe AudioShake's technology is the best in the industry and significantly outperforms all other offerings. In fact, we are the winners of Sony's Demixing Challenge, which pitted our stem separation against 40 other teams, including Big Tech companies, start-ups, and research institutes.

What kinds of sound separation do you offer?

AudioShake offers a range of models, centering on music and "dirty audio" tasks:

Music: separate different instruments, or create an instrumental. You can also separate multiple singers from a track.
Dialogue, Music, & Effects: separate speech or dialogue from background audio, or separate effects from music.
Multi-speaker Separation: separate overlapping speakers in a podcast, video, or speech file.
Lyric Transcription & Alignment: transcribe lyrics from a song, then align them with word-by-word timestamps to create a karaoke-type experience.
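
To illustrate how word-by-word timestamps can drive a karaoke-type display, here is a minimal Python sketch. The field names ("word", "start", "end"), the sample data, and the helper functions are assumptions made for illustration only; they are not AudioShake's documented output schema, so check the developer documentation for the actual format.

```python
import json

# Hypothetical alignment data. The field names ("word", "start", "end")
# are assumed for illustration and are not a documented schema.
SAMPLE_ALIGNMENT = json.loads("""
[
  {"word": "shake", "start": 0.50, "end": 0.90},
  {"word": "the",   "start": 1.00, "end": 1.20},
  {"word": "stems", "start": 1.30, "end": 1.90},
  {"word": "apart", "start": 2.00, "end": 2.80}
]
""")


def active_word(words, t):
    """Return the index of the word being sung at time t (seconds), or None."""
    for i, w in enumerate(words):
        if w["start"] <= t < w["end"]:
            return i
    return None


def render_line(words, t):
    """Render the lyric line, marking the active word in brackets and upper case."""
    idx = active_word(words, t)
    return " ".join(
        f"[{w['word'].upper()}]" if i == idx else w["word"]
        for i, w in enumerate(words)
    )


if __name__ == "__main__":
    # Step through playback time and show which word is highlighted.
    for t in (0.6, 1.5, 2.5):
        print(f"{t:>4.1f}s  {render_line(SAMPLE_ALIGNMENT, t)}")
```

A real player would call something like render_line on every frame with the current playback position; between words, no word is highlighted.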

Is this available via API or on-device?

Yes! All our separations are available via API, and many are also available on-device. Our documentation site is here.

Do you do speech transcription and alignment?

No. Our transcription work focuses only on lyrics, which are a different research task from speech. However, we work with many speech transcription and captioning services to clean dialogue before it goes through automated speech recognition (ASR).

What file formats and resolutions do you offer?

We match your input files and support sample rates up to 192 kHz. Audio can be exported in the following file formats: WAV, MP3, AAC, FLAC, AIFF, and PCM. Transcriptions can be exported as JSON or TXT files.

More details on the formats we offer and support can be found on our developer page. 

Get in touch.
Reach our sales and customer support teams directly. Start here or try a demo.