Llama 3.3 70B Instruct
Specifications
- Input
- Output
- Context window: 131K tokens
- Released: Dec 2024
Performance
- Speed: 24 tokens/s
- TTFT (time to first token): 515 ms
- Latency: n/a
- Intelligence: n/a
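As a rough illustration of how the speed and TTFT figures combine, the sketch below estimates the wall-clock time for a response. It assumes that the 24 tokens/s figure is steady output throughput after the first token and that total time is simply TTFT plus generation time; real latency varies with load, prompt length, and provider. The function name `estimate_response_seconds` is hypothetical.

```python
# Figures taken from the Performance list above.
TTFT_SECONDS = 0.515          # time to first token: 515 ms
TOKENS_PER_SECOND = 24.0      # assumed to be steady output throughput


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate total response time with a simple linear model (an assumption)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # A 240-token answer: 0.515 s to start plus 10 s of generation.
    print(f"{estimate_response_seconds(240):.3f} s")
```

Under this model, generation time dominates for anything longer than a few dozen tokens, so the throughput figure matters more than TTFT for long answers.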
Pricing
- Input: €0.12 per 1M tokens
- Output: €0.36 per 1M tokens
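To make the per-million-token prices concrete, here is a minimal cost estimate for a single request. It assumes billing is strictly linear in token counts with no minimum charge, rounding, or caching discounts; the function name `estimate_cost_eur` is hypothetical.

```python
# Prices taken from the Pricing list above (EUR per 1M tokens).
INPUT_EUR_PER_MILLION = 0.12
OUTPUT_EUR_PER_MILLION = 0.36


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Estimate the request cost in EUR, assuming purely linear billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_EUR_PER_MILLION
             + output_tokens * OUTPUT_EUR_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 1,000-token answer.
    print(f"EUR {estimate_cost_eur(10_000, 1_000):.6f}")
```

Output tokens cost three times as much as input tokens here, so long generations affect the bill more than long prompts of the same size.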
About this model
Meta Llama 3.3 70B Instruct is a multilingual, instruction-tuned model optimized for dialogue. It was trained on ~15 trillion tokens with a knowledge cutoff of December 2023, and it outperforms many open-source and closed models. Reported benchmark scores include 92.1% on IFEval (steerability), 88.4% on HumanEval (code), 77.0% on MATH, and 91.1% on MGSM (multilingual). The model features a 128K-token context window and Grouped-Query Attention, and it supports 8 languages: English, German, French, Spanish, Italian, Portuguese, Hindi, and Thai. Training used 7M GPU hours and 100% renewable energy.
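The specifications list a 131K-token context window while the description says 128K. These are most likely the same figure, since 128 × 1024 = 131,072. The sketch below uses that value to check whether a request fits. It assumes that prompt and output tokens share one window and that the exact limit is 131,072; neither is stated on this page. The function name `fits_in_context` is hypothetical.

```python
# Assumed exact window size behind both "128K" and "131K".
CONTEXT_WINDOW_TOKENS = 128 * 1024  # 131,072


def fits_in_context(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if prompt plus requested output fits in the assumed window."""
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    print(CONTEXT_WINDOW_TOKENS)
    print(fits_in_context(prompt_tokens=120_000, max_output_tokens=4_000))
```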
Technical specifications
- Capabilities
- Input modalities
- Output modalities
- Reasoning: No
Knowledge horizon
- Knowledge cutoff: Dec 2023
- Released: Dec 2024
- Training to release: 12 mo
- Since release: 17 mo