Liquid AI Launches LFM2-Audio-1.5B
Liquid AI, an MIT spin-off, has unveiled LFM2-Audio-1.5B, a voice AI model designed to run on smartphones, wearables, and other low-power devices.
With just 1.5 billion parameters, the model delivers real-time speech-to-speech conversation with end-to-end latency under 100 milliseconds, making it one of the fastest and most efficient audio AI models in its class.
What Is LFM2-Audio-1.5B?
LFM2-Audio-1.5B is a new voice AI model released by Liquid AI on September 30, 2025. Unlike older systems that use separate tools for speech-to-text, language understanding, and text-to-speech, this model does it all in one unified system.
It is built on Liquid AI's LFM2 language backbone (1.2B parameters), extended with dedicated audio input and output components that bring the total to roughly 1.5B. This unified design reduces complexity, cuts delays, and makes conversations feel natural.
Key innovations include:
- Disentangled Audio Processing: Uses continuous embeddings directly from raw sound for input, avoiding quality loss.
- High-Quality Output: Generates smooth, expressive, natural speech from discrete Mimi codec tokens (a short codec sketch follows this list).
- Large Context: Supports a 32K token window, allowing it to handle longer conversations.
- Hybrid Design: Combines convolution and attention layers for efficiency and speed.
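Since the model's output side works in Mimi codec tokens, it helps to see what that representation looks like. Below is a minimal round trip through the open Mimi codec via Hugging Face transformers; this is our illustration, not Liquid AI's code, the sample.wav file name is a placeholder, and it only demonstrates the token format LFM2-Audio generates into.

```python
# pip install transformers torch soundfile
# Sketch: encode a waveform into discrete Mimi codec tokens and decode it back.
import soundfile as sf
import torch
from transformers import AutoFeatureExtractor, MimiModel

model = MimiModel.from_pretrained("kyutai/mimi")
feature_extractor = AutoFeatureExtractor.from_pretrained("kyutai/mimi")

# Placeholder input file; Mimi expects 24 kHz mono audio.
waveform, sample_rate = sf.read("sample.wav")
inputs = feature_extractor(
    raw_audio=waveform,
    sampling_rate=feature_extractor.sampling_rate,
    return_tensors="pt",
)

with torch.no_grad():
    codes = model.encode(inputs["input_values"]).audio_codes  # integer token grid
    audio = model.decode(codes).audio_values                  # reconstructed waveform

print(codes.shape)  # (batch, codebooks, frames): the discrete tokens a model predicts
```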
Speed and Efficiency
- One of the most striking features is end-to-end latency under 100 ms, which means users can talk to the AI almost as if they were speaking to another person (a quick way to verify this on your own device is sketched after this list).
- Liquid AI says the model runs 10 times faster than competitors while matching or beating the performance of systems 10 times larger.
- This efficiency makes it ideal for devices like smartphones, smart speakers, cars, and healthcare tools.
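Sub-100 ms claims are easy to sanity-check on your own hardware. Here is a minimal timing harness; generate_reply() is a hypothetical stand-in for whatever local inference call you wire up, and the 50 ms sleep merely simulates model work.

```python
import statistics
import time

def generate_reply(audio_chunk: bytes) -> bytes:
    """Hypothetical stand-in for a local LFM2-Audio inference call."""
    time.sleep(0.05)  # simulate ~50 ms of model work
    return b""

def median_latency_ms(audio_chunk: bytes, runs: int = 20) -> float:
    # Time the end-to-end call; the median dampens warm-up and scheduler noise.
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        generate_reply(audio_chunk)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

print(f"median latency: {median_latency_ms(b'fake-audio'):.1f} ms")
```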
Capabilities and Applications
LFM2-Audio is not just a chatbot. It can handle multiple tasks through one model (a sketch of this one-model pattern follows the list), including:
- Real-time speech-to-speech conversation.
- Speech-to-text transcription with accuracy close to OpenAI’s Whisper.
- Text-to-speech synthesis with natural voices.
- Voice commands for cars, smart homes, and IoT devices.
- Meeting transcription with speaker identification.
- Live translation and multilingual support.
- Audio classification for understanding different sound patterns.
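Because a single checkpoint covers all of these tasks, an application can load the model once and route every request through it instead of maintaining a pipeline of specialized systems. The wrapper below is purely hypothetical; none of these names come from Liquid AI's actual package, and it exists only to make the one-model pattern concrete.

```python
# Hypothetical sketch of routing many audio tasks through one loaded model.
from dataclasses import dataclass

@dataclass
class Request:
    task: str                  # "transcribe", "speak", or "converse"
    audio: bytes | None = None
    text: str | None = None

class UnifiedAudioModel:
    """Stand-in for a single end-to-end audio model loaded once on-device."""

    def run(self, req: Request) -> str | bytes:
        # In a real deployment the same weights serve every branch;
        # only the input and output modalities change per task.
        if req.task == "transcribe":   # speech -> text
            return "transcript placeholder"
        if req.task == "speak":        # text -> speech
            return b"audio placeholder"
        if req.task == "converse":     # speech -> speech
            return b"reply-audio placeholder"
        raise ValueError(f"unknown task: {req.task}")

model = UnifiedAudioModel()  # loaded once, reused for ASR, TTS, and conversation
print(model.run(Request(task="transcribe", audio=b"...")))
```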
By running directly on devices, the model also provides better privacy—sensitive information stays local instead of being sent to the cloud.
Benchmark Results
On VoiceBench, an audio benchmark, LFM2-Audio scored 56.78, outperforming several larger models.
It matched, and on these tests slightly beat, Whisper-large-v3 in speech recognition (lower WER is better):
- LibriSpeech-clean WER: 2.01% vs. Whisper's 2.73%.
- TED-LIUM WER: 3.56% vs. Whisper's 3.91%.
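Word error rate (WER) counts the substitutions, insertions, and deletions needed to turn a transcript into the reference, divided by the reference word count. You can compute it yourself with the widely used jiwer library; the sample strings below are our own.

```python
# pip install jiwer
from jiwer import wer

reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"

# 2 substitutions across 9 reference words -> WER of about 22%
print(f"WER: {wer(reference, hypothesis):.2%}")
```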
These results show that a small, efficient model can compete with much larger systems while staying flexible across tasks.
Open Access for Developers
Liquid AI has released open weights under the LFM Open License v1.0. Access rules include:
- Free for research, academics, and personal use.
- Free for businesses earning under $10M revenue.
- Larger companies must negotiate enterprise licenses.
Developers can download the model from Hugging Face, test it via the Liquid AI Playground, or run it locally with a simple Python package. Complete code, examples, and integration guides are available on GitHub.
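For instance, the weights can be fetched with the standard huggingface_hub client. The repo id below is inferred from the model's name, so confirm it on Liquid AI's Hugging Face page; license-gated repos may also require logging in first.

```python
# pip install huggingface_hub
from huggingface_hub import snapshot_download

# Repo id assumed from the model name; verify it on the LiquidAI org page.
local_dir = snapshot_download(repo_id="LiquidAI/LFM2-Audio-1.5B")
print(f"model files downloaded to: {local_dir}")
```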
Real-World Use Cases
The efficiency of LFM2-Audio means it can be used in many industries:
- Healthcare: Doctors can transcribe notes in real time without sending data to the cloud.
- Automotive: Cars can run fully offline, voice-controlled assistants for safer driving.
- Customer Service: Businesses can deploy voice bots that respond naturally without high cloud costs.
- Content Creation: Podcasters and creators can transcribe and edit audio instantly.
- Government & Finance: Privacy-sensitive environments can use secure, on-device voice AI.
News Gist
Liquid AI has launched LFM2-Audio-1.5B, a small yet powerful voice AI model that runs on smartphones and edge devices. With sub-100ms latency, open weights, and high accuracy, it delivers real-time speech AI while protecting privacy and cutting costs.
FAQs
Q1. What is LFM2-Audio-1.5B?
It’s Liquid AI’s new end-to-end audio AI model that combines speech recognition, conversation, and text-to-speech in a single system.
Q2. Who created it?
It was developed by Liquid AI, an MIT spin-off founded in 2023 by Ramin Hasani, Mathias Lechner, Alexander Amini, and Daniela Rus.
Q3. Why is it important?
It runs on smartphones and low-power devices with under 100ms response time, making real-time, private, and offline voice AI possible.
Q4. How accurate is it?
It matches or outperforms larger models such as OpenAI's Whisper in speech recognition, while being far smaller and, according to Liquid AI, up to 10x faster.
Q5. Is it open-source?
Largely. The weights are openly released under the LFM Open License v1.0, with free access for research and small businesses; strictly speaking, that makes it open-weights rather than open-source, since larger companies need an enterprise license.
Q6. Where can developers try it?
It’s available on Hugging Face, GitHub, the Liquid AI Playground, and via a Python package for local use.