Voxtral Small 24B 2507
Specifications
- Input
- Output
- Context window: 32K tokens
- Released: Jul 2025
Performance
- Speed: 64 t/s
- TTFT: 244 ms
- Latency: 68 ms
- Intelligence: n/a
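As a rough illustration of what the speed and TTFT figures above imply, the sketch below estimates how long a streamed response might take. It assumes that "Speed" means output tokens per second, that throughput stays constant for the whole response, and that TTFT is paid once up front. The listing states none of these, and the function name `estimate_response_seconds` is hypothetical.

```python
# Figures copied from the Performance list; their exact definitions are assumed.
OUTPUT_TOKENS_PER_SECOND = 64.0  # "Speed: 64 t/s"
TTFT_SECONDS = 0.244             # "TTFT: 244 ms"


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock time as TTFT plus steady-rate token streaming."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (64, 640, 2048):
        print(f"{n} output tokens: about {estimate_response_seconds(n):.2f} s")
```

Real latency depends on load, prompt length, and audio duration, so this is only a back-of-the-envelope bound.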
Pricing
- Input: €0.15 per 1M tokens
- Output: €0.35 per 1M tokens
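To make the per-million-token prices above concrete, here is a minimal sketch of a request cost estimate. It assumes all input, including audio, is billed as tokens at the listed input rate, which the listing does not confirm. The function name `estimate_cost_eur` is hypothetical.

```python
# Prices copied from the Pricing list above, in euros per one million tokens.
INPUT_EUR_PER_MILLION = 0.15
OUTPUT_EUR_PER_MILLION = 0.35


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in euros for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_EUR_PER_MILLION
             + output_tokens * OUTPUT_EUR_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 20,000-token prompt with a 1,000-token reply.
    print(f"Estimated cost: EUR {estimate_cost_eur(20_000, 1_000):.5f}")
```

Keeping the two rates as separate constants makes it easy to update one without the other if the provider changes its pricing.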
About this model
Mistral Voxtral Small 24B is a multimodal model with 24B parameters that accepts both text and audio inputs. It enables natural voice conversations and audio understanding alongside text processing, and its features include audio transcription, audio-based reasoning, and voice-to-text conversion. It is built on the Mistral architecture with additional training for audio modalities. It is suited to voice assistants, audio analysis applications, and multimodal AI systems that combine text and speech processing.
Technical specifications
- Capabilities
- Input modalities
- Output modalities
- Reasoning: No