Llama 3.1 8B Instruct
Specifications
- Input
- Output
- Context window: 131K tokens
- Released: Jul 2024
Performance
- Speed: 111 tokens/s
- TTFT (time to first token): 392 ms
- Latency: n/a
- Intelligence: n/a
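As a rough illustration of how the speed and TTFT figures combine, the sketch below estimates the wall-clock time of a streamed response. It assumes the listed speed is the output rate after the first token and ignores network and queueing overhead; the function name `estimate_response_seconds` is hypothetical and not part of any API.

```python
def estimate_response_seconds(
    output_tokens: int,
    ttft_ms: float = 392.0,           # TTFT listed on this page
    tokens_per_second: float = 111.0  # speed listed on this page
) -> float:
    """Rough estimate: time to first token plus steady-state generation time.

    Assumption: the listed speed applies to output tokens produced after the
    first token arrives; real latency also depends on load and prompt length.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


# A 500-token answer: 0.392 s + 500 / 111 s
print(f"{estimate_response_seconds(500):.2f} s")  # 4.90 s
```

Under these assumptions, generation time dominates for anything beyond a few dozen tokens, so the TTFT mostly matters for short, interactive replies.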
Pricing
- Input: €0.06 per 1M tokens
- Output: €0.27 per 1M tokens
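To make the per-million-token prices concrete, here is a minimal sketch that estimates the cost of a workload from its token counts. The constants come from the prices listed above; the function name `estimate_cost_eur` is hypothetical, and actual billing (rounding, minimum charges, taxes) may differ.

```python
INPUT_EUR_PER_MILLION = 0.06   # listed input price
OUTPUT_EUR_PER_MILLION = 0.27  # listed output price


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in euros from input and output token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_EUR_PER_MILLION
             + output_tokens * OUTPUT_EUR_PER_MILLION)
    return total / 1_000_000


# 2M input tokens and 0.5M output tokens: 0.12 + 0.135 euros
print(f"€{estimate_cost_eur(2_000_000, 500_000):.3f}")  # €0.255
```

Output tokens cost 4.5 times as much as input tokens at these rates, so long generated answers drive the bill more than long prompts do.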
About this model
Meta Llama 3.1 8B Instruct is an efficient, multilingual, instruction-tuned model optimized for dialogue and assistant use cases. With 8 billion parameters and a 128K-token (131,072-token) context length, it performs well across general tasks, code generation, and multilingual understanding. It supports function calling and tool use, and its architecture uses Grouped-Query Attention. It suits deployments that need lower compute resources while maintaining quality across English and 7 additional languages, including German, French, Spanish, and Hindi.
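The 128K context length and the 131K tokens listed in the specifications describe the same limit if "128K" is read as 128 × 1024. The sketch below checks how much room a prompt leaves for the response, assuming the prompt and the generated output share one window; the name `max_output_tokens` is hypothetical, and a hosting provider may enforce a lower output cap.

```python
# Assumption: "128K" means 128 * 1024 = 131,072 tokens.
CONTEXT_WINDOW_TOKENS = 128 * 1024


def max_output_tokens(prompt_tokens: int,
                      context_window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Tokens left for the response once the prompt is in the window.

    Returns 0 when the prompt alone fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


print(CONTEXT_WINDOW_TOKENS)       # 131072
print(max_output_tokens(100_000))  # 31072
print(max_output_tokens(200_000))  # 0
```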
Technical specifications
- Capabilities
- Input modalities
- Output modalities
- Reasoning: No
Knowledge horizon
- Knowledge cutoff: Dec 2023
- Released: Jul 2024
- Training to release: 7 months
- Since release: 22 months
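The intervals above follow from simple month arithmetic. The sketch below reproduces the 7-month gap and derives the reference date implied by "22 months since release". Because the page gives only month precision, the day of month is assumed to be 1, and both helper names are hypothetical.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def add_months(start: date, months: int) -> date:
    """Return the first day of the month that is `months` after `start`."""
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


cutoff = date(2023, 12, 1)    # knowledge cutoff, day assumed
released = date(2024, 7, 1)   # release month, day assumed

print(months_between(cutoff, released))  # 7
print(add_months(released, 22))          # 2026-05-01
```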