AI tools
December 4, 2023

Stability AI Introduces Stable Audio - Generate Music For FREE

The music industry is about to get disrupted with this AI powered tool.

Jim Clyde Monge

The music industry is about to be revolutionized.

Today, Stability AI, the king of open-source AI tools and models like Stable Diffusion and StableLM, launched StableAudio, its first AI product for music and sound generation.

The music industry is notoriously difficult to break into. Even if you have the talent and the drive, you still need the skills and resources to create and produce music.

But what if you didn’t need any of that? What if you could create music with just a creative mind and a good AI prompt?

StableAudio is an AI tool that can generate music from scratch. All you need to do is provide a few simple instructions, and the AI will do the rest.

What is StableAudio?

StableAudio is a first-of-its-kind AI tool that uses generative AI techniques to create high-quality music and sound effects.

To use StableAudio, you simply provide a descriptive text prompt and a desired length of audio. For example, you could enter “Post-Rock, Guitars, Drum Kit, Bass, Strings, Euphoric, Up-Lifting, Moody, Flowing, Raw, Epic, Sentimental, 125 BPM” to generate a 95-second track in the style of post-rock.
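The example prompt above follows a loose pattern: genre first, then instruments, then moods, then tempo. Here is a minimal Python sketch of assembling a prompt that way. The tag order is my reading of the example, and the helper name build_prompt is hypothetical; it is not part of any StableAudio API.

```python
def build_prompt(genre, instruments, moods, bpm):
    """Join tags into one comma-separated prompt: genre, instruments, moods, tempo."""
    parts = [genre, *instruments, *moods, f"{bpm} BPM"]
    return ", ".join(parts)


# Rebuild the post-rock example from its individual tags.
prompt = build_prompt(
    genre="Post-Rock",
    instruments=["Guitars", "Drum Kit", "Bass", "Strings"],
    moods=["Euphoric", "Up-Lifting", "Moody", "Flowing", "Raw", "Epic", "Sentimental"],
    bpm=125,
)
print(prompt)
```

Keeping the tags in separate lists makes it easy to swap a mood or an instrument and compare the results.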

StableAudio is ideal for musicians seeking to create samples to use in their music. You can use it to create sound effects, background music, or even your own original compositions.

Try it yourself

Head over to the StableAudio dashboard and sign up.

StableAudio dashboard

Then, go to the “Generate Music” dashboard to start generating your own music.

StableAudio generate dashboard

Type in your prompt and set the duration. Keep in mind that the free tier limits each track to 45 seconds and allows 20 tracks per month.

Prompt: Calm meditation music to play in a spa lobby

Click the little right arrow button to begin the audio generation.

At the moment, the website is experiencing heavy traffic, so it’s not working. I will update the article once the website is back up and running.

Stable Audio We’re seeing a lot of traffic

In the meantime, you can explore the examples provided in the User Guide section of StableAudio.

StableAudio User Guide examples

How it works

Here are some of the key technical details of how StableAudio works:

StableAudio technical background
  • The VAE (variational autoencoder) compresses stereo audio into a compact, noise-resistant latent encoding. The encoding is lossy but can be decoded back into audio, and working with it makes generation and training faster than working with the raw audio samples themselves.
  • The text encoder is used to extract features from the text prompt. These features are then used to condition the diffusion model.
  • The diffusion model is a U-Net-based model that uses a combination of residual layers, self-attention layers, and cross-attention layers to denoise the input and reconstruct the desired audio.
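To make the order of those three components concrete, here is a toy Python sketch of the data flow only: text features go in, a noisy latent gets refined step by step, and a decoder turns the latent into samples. Every function below is a made-up stand-in (simple arithmetic instead of neural networks), and the sizes are arbitrary assumptions; this is not Stability AI's code or API.

```python
import random

LATENT_LEN = 8  # toy latent length (assumption)
UPSAMPLE = 4    # toy decoder upsampling factor (assumption)


def encode_text(prompt):
    """Stand-in text encoder: reduce the prompt to two crude numeric features."""
    words = prompt.lower().split()
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    return [len(words), avg_word_len]


def denoise_step(latent, features, step, total_steps):
    """Stand-in for the U-Net: move the latent toward a text-derived target."""
    target = features[1] / 10.0
    blend = 1.0 / (total_steps - step)  # reaches 1.0 on the final step
    return [x + blend * (target - x) for x in latent]


def vae_decode(latent):
    """Stand-in VAE decoder: expand each latent value into UPSAMPLE samples."""
    return [x for x in latent for _ in range(UPSAMPLE)]


def generate(prompt, steps=10, seed=0):
    rng = random.Random(seed)
    features = encode_text(prompt)                              # 1. text conditioning
    latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_LEN)]  # 2. start from noise
    for step in range(steps):                                   # 3. iterative denoising
        latent = denoise_step(latent, features, step, steps)
    return vae_decode(latent)                                   # 4. latent back to "audio"


audio = generate("Calm meditation music to play in a spa lobby")
print(len(audio), "toy samples")
```

The point to take away is that the expensive loop runs on the short latent, not on the full-length audio, which is why the compressed encoding speeds things up.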

I won’t bore you with a very long explanation of how it works in the background. If you want to know more about the technical details, read the blog post from Stability AI.

Another important detail: the StableAudio model was trained on a dataset of over 800,000 audio files, including music, sound effects, and single-instrument stems. That adds up to over 19,500 hours of audio.
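For a rough sense of scale, you can divide those two figures to estimate the average clip length. Both numbers are quoted as "over", so treat the result as a ballpark, not an exact statistic.

```python
FILES = 800_000  # audio files in the training set (quoted as "over 800,000")
HOURS = 19_500   # total duration in hours (quoted as "over 19,500")

# Convert hours to seconds, then average across all files.
avg_seconds = HOURS * 3600 / FILES
print(f"Average clip length: about {avg_seconds:.0f} seconds")
```

That works out to roughly a minute and a half per file, which fits a mix of full tracks, short sound effects, and stems.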

How much does it cost?

If you want to generate your own music for personal use, it’s completely free. However, if you want to use the content for commercial purposes, you need to upgrade to the Professional tier. Here are the pricing tiers:

  • Free: 20 monthly tracks up to 45 seconds each.
  • Professional ($11.99 per month): 500 monthly tracks up to 90 seconds each.
  • Enterprise (custom amount)
StableAudio Pricing
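To compare the tiers listed above, here is a quick Python calculation of the maximum audio you can generate per month and the effective cost per track. The numbers are the ones in this article and may have changed since; the dictionary layout and function names are just for illustration.

```python
# Tier limits as listed in this article; prices and quotas may have changed.
TIERS = {
    "Free": {"price_usd": 0.0, "tracks": 20, "max_seconds": 45},
    "Professional": {"price_usd": 11.99, "tracks": 500, "max_seconds": 90},
}


def monthly_minutes(tier):
    """Maximum minutes of audio per month if every track uses the full length."""
    return tier["tracks"] * tier["max_seconds"] / 60


def cost_per_track(tier):
    """Monthly price divided by the monthly track quota."""
    return tier["price_usd"] / tier["tracks"]


for name, tier in TIERS.items():
    print(
        f"{name}: up to {monthly_minutes(tier):.0f} min/month, "
        f"${cost_per_track(tier):.3f} per track"
    )
```

In other words, the Free tier tops out at 15 minutes of audio a month, while Professional allows up to 750 minutes (12.5 hours) at roughly 2.4 cents per track.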

Final Thoughts

Overall, I am impressed with this new AI tool. The quality of the audio is on par with what a human professional can produce.

StableAudio is a game-changer, and it could disrupt the entire music and sound effects industry. Some professional musicians will surely be upset by its arrival and see it as a threat to their livelihood, but I doubt anyone can stop it.

After all, technology is constantly evolving, and there’s no going back.

It can be used for good or for bad. It’s up to us to decide how we use it. I, for one, am excited to see what the future holds for this technology. I believe it has the potential to make music more accessible and creative than ever before.
