Llama 3.1 8B Instruct

by Meta

Specifications

Input
Output
Context window: 131K tokens
Released: Jul 2024

Performance

Speed: 60 t/s
TTFT: 494 ms
Latency: 157 ms
Intelligence: —
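
As a rough illustration of how the speed and TTFT figures combine, the sketch below estimates the wall-clock time of a single response. It assumes a simple model: time to first token plus output tokens divided by throughput. The function name and this model are illustrative assumptions, not something the listing defines. The separate 157 ms latency figure is left out because the listing does not say what it measures.

```python
def estimated_response_seconds(output_tokens: int,
                               ttft_ms: float = 494.0,
                               tokens_per_second: float = 60.0) -> float:
    """Approximate response time: TTFT plus steady-state generation time.

    Defaults are the figures listed under Performance; real timings vary
    with load, prompt length, and provider.
    """
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Estimate for a 300-token reply under the assumptions above.
    print(f"{estimated_response_seconds(300):.3f} s")
```

Under these assumptions, generation time dominates for anything beyond a few dozen tokens, so throughput matters more than TTFT for long replies.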

Pricing

Input: €0.15
Output: €0.15
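
The listing does not state the unit behind these prices. The sketch below assumes they are per 1 million tokens, a common convention but an assumption here, and exposes that unit as the hypothetical parameter `tokens_per_price_unit` so it can be corrected.

```python
def estimate_cost_eur(input_tokens: int,
                      output_tokens: int,
                      input_price: float = 0.15,
                      output_price: float = 0.15,
                      tokens_per_price_unit: int = 1_000_000) -> float:
    """Estimate the cost of one request in euros.

    ASSUMPTION: prices apply per `tokens_per_price_unit` tokens
    (1M by default); the listing itself does not state the unit.
    """
    total = input_tokens * input_price + output_tokens * output_price
    return total / tokens_per_price_unit


if __name__ == "__main__":
    # A request with a 2,000-token prompt and a 500-token reply.
    print(f"{estimate_cost_eur(2000, 500):.6f} EUR")
```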

About this model

Meta Llama 3.1 8B Instruct is an efficient, multilingual, instruction-tuned model optimized for dialogue and assistant use cases. With 8 billion parameters and a 128K-token context window (131,072 tokens, listed above as 131K), it provides strong performance across general tasks, code generation, and multilingual understanding. It uses a Grouped-Query Attention (GQA) architecture and supports function calling and tool use. It is well suited to deployments that need lower compute requirements while maintaining quality across English and 7 additional languages, including German, French, Spanish, and Hindi.
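
The "128K" and "131K" figures describe the same limit: 128 × 1024 = 131,072 tokens. A minimal sketch of a pre-flight budget check follows. It assumes the prompt and the generated tokens share one window, and the helper name `fits_in_context` is hypothetical. Token counts must come from the model's own tokenizer, which is not shown here.

```python
CONTEXT_WINDOW_TOKENS = 128 * 1024  # 131,072 tokens, i.e. "128K" or roughly "131K"


def fits_in_context(prompt_tokens: int,
                    max_new_tokens: int,
                    context_window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the requested output fits in the window.

    ASSUMPTION: input and output tokens count against the same limit.
    """
    return prompt_tokens + max_new_tokens <= context_window


if __name__ == "__main__":
    print(CONTEXT_WINDOW_TOKENS)
    print(fits_in_context(130_000, 1_000))
    print(fits_in_context(130_000, 2_000))
```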

Technical specifications

Capabilities
Input modalities
Output modalities
Reasoning: No

Knowledge horizon

Knowledge cutoff: Dec 2023

Released: Jul 2024

Training to release: 7 mo

Since release: 23 mo
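
The 7-month gap between knowledge cutoff and release can be checked with simple month arithmetic. The helper below is an illustrative sketch that works at month granularity, matching the precision of the dates given. Its name is an assumption, not part of the listing.

```python
def months_between(start: tuple[int, int], end: tuple[int, int]) -> int:
    """Whole months from `start` to `end`, each given as (year, month)."""
    (start_year, start_month), (end_year, end_month) = start, end
    return (end_year - start_year) * 12 + (end_month - start_month)


if __name__ == "__main__":
    # Knowledge cutoff Dec 2023 to release Jul 2024.
    print(months_between((2023, 12), (2024, 7)))
```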

See also

Kimi K2.5

Qwen 3.6 27B

Gemma 4 31B