“Stability AI” and “Arm” Revolutionize AI Audio for Mobile Devices
On March 3, 2025, at MWC Barcelona, Stability AI announced a partnership with chipmaker Arm to improve its “Stable Audio Open” model for mobile devices using Arm chips.
The model generates audio, including sound effects, from text descriptions and is designed to run offline.
Key Points
- Optimizing Stable Audio Open for mobile devices began as a significant challenge, with initial audio generation on an Arm CPU taking 240 seconds.
- By refining the model and using Arm’s software stack, the team cut Stable Audio Open’s generation time to 8 seconds for an 11-second clip, a 30x speedup.
- Stability claims Stable Audio Open’s training set is made up entirely of royalty-free audio and songs.
- The optimized model is not yet available for download, but Stability’s CEO indicated plans to make it accessible for consumer apps in the future.
- According to Stability AI, “Audio is just the start. Our partnership with Arm brings cutting-edge AI models for image, video, and 3D to mobile devices, revolutionizing high-quality media generation across all visual formats.”
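The timing figures in the key points can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the 240-second baseline and the 8-second result were measured on the same 11-second clip, which the announcement implies but does not state outright.

```python
# Figures quoted in the key points above.
BASELINE_SECONDS = 240.0   # initial generation time on an Arm CPU
OPTIMIZED_SECONDS = 8.0    # generation time after optimization
CLIP_SECONDS = 11.0        # length of the generated clip


def speedup(before: float, after: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    return before / after


def real_time_factor(generation_time: float, clip_length: float) -> float:
    """Return seconds of compute per second of audio.

    A value below 1.0 means the clip is generated faster than it plays back.
    """
    return generation_time / clip_length


if __name__ == "__main__":
    print(f"Speedup: {speedup(BASELINE_SECONDS, OPTIMIZED_SECONDS):.0f}x")
    print(f"Real-time factor: {real_time_factor(OPTIMIZED_SECONDS, CLIP_SECONDS):.2f}")
```

The speedup works out to 30x, matching the announced figure, and the real-time factor comes to roughly 0.73, meaning the optimized model produces the clip in less time than the clip takes to play.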
Background
Stability AI is a prominent startup recognized for developing generative AI models such as Stable Diffusion, which has transformed image generation, and Stable Audio, which is tailored for AI-driven music and sound creation.
Arm, a global leader in semiconductors, designs power-efficient processors that are widely used in mobile and embedded systems.
Their collaboration optimizes Stable Audio Open for Arm chips, making AI-powered audio generation more accessible on mobile devices.
Stability AI has also ventured into Stable Video and Stable 3D, with the objective of extending AI creativity across various media formats.
News Gist
At MWC Barcelona 2025, Stability AI and Arm announced an optimized Stable Audio Open model for Arm-powered mobile devices, achieving 30x faster audio generation.
Stability AI’s future plans include bringing its image, video, and 3D models to mobile devices.