Mistral Small 4 119B Instruct
Specifications
- Input
- Output
- Context window: 60K tokens
- Released: Mar 2026
Performance
- Speed: 93 t/s
- TTFT: 328 ms
- Latency: 294 ms
- Intelligence: not available
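As a rough illustration of how the listed speed and TTFT figures combine, the sketch below estimates the wall-clock time for a streamed response. It assumes a constant decoding speed after the first token and ignores network overhead, queueing, and any hidden reasoning tokens; the function name and the example token count are hypothetical.

```python
# Figures taken from the performance list; everything else is an assumption.
SPEED_TOKENS_PER_S = 93.0  # listed output speed (t/s)
TTFT_S = 0.328             # listed time to first token (328 ms)


def estimated_response_seconds(output_tokens: int) -> float:
    """Estimate total response time as TTFT plus decode time at constant speed."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_S + output_tokens / SPEED_TOKENS_PER_S


# Hypothetical 500-token answer.
print(f"{estimated_response_seconds(500):.2f} s")
```

Real latency varies with load and prompt length, so this is best read as a lower-bound style estimate rather than a guarantee.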
Pricing
- Input: €0.15 per 1M tokens
- Output: €0.60 per 1M tokens
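To make the per-token prices concrete, here is a minimal cost estimator built only from the two listed rates. The function name and the example workload are hypothetical, and the sketch assumes billing is strictly linear in token counts with no minimums, discounts, or separate charges for image inputs.

```python
from decimal import Decimal

# Listed prices in euros per 1M tokens.
INPUT_EUR_PER_M = Decimal("0.15")
OUTPUT_EUR_PER_M = Decimal("0.60")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in euros for one request (linear pricing assumed)."""
    total = (Decimal(input_tokens) * INPUT_EUR_PER_M
             + Decimal(output_tokens) * OUTPUT_EUR_PER_M)
    return total / TOKENS_PER_M


# Hypothetical workload: 10,000 prompt tokens and 2,000 completion tokens.
print(request_cost_eur(10_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.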
About this model
Mistral Small 4 is a 119B-parameter Mixture-of-Experts model (128 experts, 4 active per token, 6.5B active parameters) that unifies instruct, reasoning, and coding capabilities into a single multimodal model. It accepts text and image inputs, supports function calling, structured outputs, and configurable reasoning effort (none for fast responses, high for deep step-by-step reasoning). With a 256K context window and Apache 2.0 license, it delivers 40% lower latency and 3x higher throughput compared to Mistral Small 3.
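The "128 experts, 4 active per token" figure can be illustrated with a generic top-k routing sketch. This is not Mistral's actual router: the softmax-over-selected-experts weighting, the random logits, and the function name are assumptions chosen only to show the general technique. The last line computes the share of parameters active per token from the stated 6.5B and 119B figures, which comes to roughly 5.5%.

```python
import math
import random

NUM_EXPERTS = 128  # from the model description
TOP_K = 4          # experts active per token, from the model description


def route_token(router_logits, k=TOP_K):
    """Select the k highest-scoring experts and softmax-normalize their weights."""
    if len(router_logits) < k:
        raise ValueError("need at least k router logits")
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical router scores for a single token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
for expert, weight in route_token(logits):
    print(f"expert {expert:3d}  weight {weight:.3f}")

print(f"active parameter share: {6.5 / 119:.1%}")
```

Only the selected experts run for a given token, which is why the compute per token tracks the active parameter count rather than the full 119B.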
Technical specifications
- Capabilities
- Input modalities
- Output modalities
- Reasoning: Hybrid (default: off; configurable up to high)
Knowledge horizon
- Knowledge cutoff: Nov 2024
- Released: Mar 2026
- Training to release: 16 mo
- Since release: 3 mo
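The 16-month gap between knowledge cutoff and release follows from simple month arithmetic, sketched below. The helper name is hypothetical, and because the source gives only month-level dates, the first day of each month is assumed.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Month-level dates from the listing; the day (1) is an assumption.
KNOWLEDGE_CUTOFF = date(2024, 11, 1)
RELEASED = date(2026, 3, 1)

print(months_between(KNOWLEDGE_CUTOFF, RELEASED))  # prints 16
```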