Gemma 4 31B
Specifications
- Input: text, images, video
- Output: text
- Context window: 256K tokens
- Released: Apr 2026
Performance
- Speed: 65 t/s
- TTFT: 682 ms
- Latency: 2.9 s
- Intelligence: not reported
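As a rough way to read the speed and TTFT figures together, the sketch below estimates how long a streamed response might take for a given output length. It assumes total time is TTFT plus output tokens divided by a constant decode speed; that model, and the helper name `estimate_response_seconds`, are illustrative assumptions and not the page's definition of Latency, which is not stated here.

```python
# Figures taken from the Performance list above.
TTFT_SECONDS = 0.682        # time to first token
TOKENS_PER_SECOND = 65.0    # reported output speed


def estimate_response_seconds(output_tokens: int) -> float:
    """Hypothetical helper: rough wall-clock estimate for one response.

    Assumes a constant decode rate after the first token, which real
    deployments only approximate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 650, 2000):
        print(f"{n} tokens: ~{estimate_response_seconds(n):.1f} s")
```

Under these assumptions, longer answers are dominated by decode time rather than TTFT, so the speed figure matters more than TTFT once outputs run to several hundred tokens.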
Pricing
- Input: €0.10 per 1M tokens
- Output: €0.30 per 1M tokens
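To make the per-token prices concrete, here is a minimal sketch that turns token counts into an estimated cost in euros. The helper name `estimate_cost_eur` is hypothetical, and the sketch assumes simple linear billing at the listed rates with no minimums, caching discounts, or separate charges for image or video input.

```python
from decimal import Decimal

# Listed prices from the Pricing section, in euros per 1M tokens.
INPUT_EUR_PER_MILLION = Decimal("0.10")
OUTPUT_EUR_PER_MILLION = Decimal("0.30")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: linear cost estimate at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_EUR_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_EUR_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # A long-context request: 200,000 input tokens, 10,000 output tokens.
    print(estimate_cost_eur(200_000, 10_000))
```

For example, 200,000 input tokens and 10,000 output tokens come to €0.02 + €0.003 = €0.023 under this linear model. `Decimal` is used rather than floats so that small per-request amounts add up without rounding drift.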
About this model
Google Gemma 4 31B is a dense, multimodal language model with 31B parameters and a 256K-token context window. It accepts text, image, and video input and generates text output, with a configurable thinking mode for step-by-step reasoning. The model scores 85.2% on MMLU Pro, 80.0% on LiveCodeBench v6, and 88.4% on MMMLU, benchmarks that cover reasoning, coding, and multilingual evaluation. It is available under the Apache 2.0 license.
Technical specifications
- Capabilities
- Input modalities: text, images, video
- Output modalities: text
- Reasoning: hybrid (thinking mode off by default)
Knowledge horizon
- Knowledge cutoff: Jan 2025
- Released: Apr 2026
- Training to release: 15 mo
- Since release: 2 mo