TheNextPort

The Future of Audio Creation with Stable Audio

With Stable Audio, anyone can create high-quality music and sound effects using text prompts, even if they have no prior experience with music or audio production.

By Michael Bires

Stability AI, recognized for creating AI-generated visuals, has introduced a new text-to-audio generative AI platform called Stable Audio.

Stable Audio uses a diffusion model, the same technology behind Stability AI’s widely used image platform, Stable Diffusion, but with a focus on audio rather than images. This platform enables users to create songs or background audio for various projects.

Most audio diffusion models generate audio of a fixed length, which isn’t ideal for music production where the length of songs can vary. Stability AI’s platform, however, allows users to create sounds of different lengths by training on music and incorporating text metadata to mark a song’s start and end time.

Previously, a model trained on 30-second clips could only generate 30 seconds of audio, and the output often sounded like an arbitrary section of a song. With modifications to the model, Stability AI now gives Stable Audio users more control over the length of the song.
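The timing idea described above can be sketched in a few lines of Python. This is only an illustration, not Stability AI's code: the 95-second window, the function names, and the field names `seconds_start` and `seconds_total` are assumptions made for this example.

```python
import random

WINDOW_SECONDS = 95.0  # assumed fixed training window; not stated in the article


def sample_training_chunk(track_seconds, window=WINDOW_SECONDS, rng=random):
    """Pick a fixed-length chunk from a track and record where it sits.

    Returns the chunk's start offset and the full track length, the two
    timing values a model could be conditioned on (hypothetical names).
    """
    # A track shorter than the window can only start at 0.
    latest_start = max(0.0, track_seconds - window)
    start = rng.uniform(0.0, latest_start)
    return {"seconds_start": start, "seconds_total": track_seconds}


def generation_conditioning(desired_seconds):
    """At generation time, request a piece that starts at 0 and lasts N seconds."""
    return {"seconds_start": 0.0, "seconds_total": float(desired_seconds)}
```

Under these assumptions, every training chunk carries both its position in the song and the song's total length, so a call such as `generation_conditioning(45)` asks for a complete 45-second piece rather than an arbitrary slice.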

Stability AI describes Stable Audio as being at the forefront of audio generation research. It was developed by the company's generative audio research lab, Harmonai, which says it is continually refining its model architectures, datasets, and training processes to improve output quality, controllability, inference speed, and output length.

Stable Audio was trained on a dataset of over 800,000 audio files, including music, sound effects, and single-instrument stems, along with text metadata from AudioSparx, a stock music licensing company. This dataset represents more than 19,500 hours of audio.

Here are some of the features of Stable Audio:

  • It generates audio in high-quality, 44.1 kHz stereo.

  • It can be conditioned on text metadata as well as audio file duration and start time, allowing for control over the content and length of the generated audio.

  • It can generate music, sound effects, and human speech.

  • It is easy to use, even for people with no experience with music or audio production.
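To give a sense of what 44.1 kHz stereo means in practice, here is a small sketch that counts the sample values in a clip. The 16-bit sample size used for the size estimate is an assumption for illustration; the article does not say which bit depth or file format Stable Audio exports.

```python
SAMPLE_RATE_HZ = 44_100  # 44.1 kHz, from the feature list
CHANNELS = 2             # stereo


def sample_values(seconds, sample_rate=SAMPLE_RATE_HZ, channels=CHANNELS):
    """Total number of individual sample values in a clip of the given length."""
    return int(seconds * sample_rate) * channels


def pcm16_megabytes(seconds):
    """Approximate uncompressed size, assuming 16-bit (2-byte) PCM samples."""
    return sample_values(seconds) * 2 / 1_000_000
```

By this count, a 45-second clip holds 3,969,000 sample values, or just under 8 MB as uncompressed 16-bit audio.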

Stable Audio offers three pricing tiers: a free version for 20 tracks a month of up to 45 seconds each; an $11.99 Professional tier for 500 tracks of up to 90 seconds each; and an Enterprise subscription that lets companies customize their usage and price. Audio created with the free version cannot be used commercially.
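The tiers are easier to compare as audio per month and price per track. The sketch below does that arithmetic with the figures quoted above, assuming the Professional allowance is monthly like the free one; the Enterprise tier is custom-priced and left out.

```python
# Figures taken from the pricing paragraph; treating the Professional
# allowance as monthly is an assumption.
TIERS = {
    "Free": {"price_usd": 0.00, "tracks": 20, "max_seconds": 45},
    "Professional": {"price_usd": 11.99, "tracks": 500, "max_seconds": 90},
}


def monthly_minutes(tier):
    """Upper bound on generated audio per month, in minutes."""
    return tier["tracks"] * tier["max_seconds"] / 60


def price_per_track(tier):
    """Price divided by the track allowance, in US dollars."""
    return tier["price_usd"] / tier["tracks"]


for name, tier in TIERS.items():
    print(f"{name}: up to {monthly_minutes(tier):.0f} min, "
          f"${price_per_track(tier):.3f} per track")
```

On these numbers the free tier tops out at 15 minutes of audio a month, while the Professional tier allows up to 750 minutes at roughly 2.4 cents per track.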

Stable Audio could be useful for creating background tunes for podcasts or videos. In our tests it also handled different music genres, such as house and trance in the style of Armin van Buuren. You can listen to the result below.

Prompt: House, Progressive, Synthesizer, 909, Dramatic Chords, Choir, Euphoric, Bass, Piano, Guitars, 128 BPM
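The prompt is simply a comma-separated list of descriptive tags that ends with a tempo. If you want to experiment with variations, a tiny helper like the one below can assemble such strings. The function name `build_prompt` is hypothetical and nothing here calls Stable Audio itself; it only builds the text you would paste into the prompt box.

```python
def build_prompt(tags, bpm):
    """Join descriptive tags and a tempo into one comma-separated prompt."""
    # Drop empty entries and stray whitespace so the prompt stays tidy.
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    return ", ".join(cleaned + [f"{bpm} BPM"])


tags = ["House", "Progressive", "Synthesizer", "909", "Dramatic Chords",
        "Choir", "Euphoric", "Bass", "Piano", "Guitars"]
prompt = build_prompt(tags, 128)
```

With these tags, `prompt` matches the prompt string used in our test, and swapping a genre or changing the BPM gives a quick variation to try.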

Stable Audio is starting to shake things up in the world of audio creation. It can help creators try new things in music and sound production, or simply serve as a source of inspiration. Whether you're making background tunes for a podcast or experimenting with different music genres, the platform has something to offer. With its user-friendly approach and range of pricing options, it makes high-quality audio creation accessible to a broad audience.
