
Llama 3.1 8B Instruct

by Meta

Specifications

Input
Output
Context window
131K tokens
Released
Jul 2024

Performance

Speed
111 t/s
TTFT
392 ms
Latency
Intelligence
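
As a rough illustration of how the two figures above combine, the sketch below estimates wall-clock time for a streamed response. It assumes total time is approximately TTFT plus output tokens divided by throughput; the page does not say how these metrics were measured, so that formula and the function name `estimate_response_seconds` are assumptions for illustration only.

```python
# Figures copied from the Performance section of this page.
SPEED_TOKENS_PER_S = 111.0   # output throughput, tokens per second
TTFT_S = 0.392               # time to first token, in seconds (392 ms)


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate response time as TTFT + output_tokens / speed.

    This is an assumed model, not a documented formula. Real latency
    also depends on prompt length, load, and network conditions.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_S + output_tokens / SPEED_TOKENS_PER_S


if __name__ == "__main__":
    for n in (50, 500, 2000):
        print(f"{n} tokens: about {estimate_response_seconds(n):.2f} s")
```

Under this model, throughput dominates for long outputs, while TTFT dominates for short replies of a few dozen tokens.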

Pricing

Input
€0.06
per 1M tokens
Output
€0.27
per 1M tokens
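
The per-million-token rates above translate into a request cost as sketched below. This is a minimal example that assumes billing is strictly linear in token count, with no minimum charge, rounding, or caching discount (the page states none); the function name `estimate_cost_eur` is hypothetical.

```python
from decimal import Decimal

# Rates copied from the Pricing section of this page (EUR per 1M tokens).
INPUT_EUR_PER_MTOK = Decimal("0.06")
OUTPUT_EUR_PER_MTOK = Decimal("0.27")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate request cost in EUR, assuming purely linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_EUR_PER_MTOK
             + Decimal(output_tokens) * OUTPUT_EUR_PER_MTOK)
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # A 2,000-token prompt with a 500-token reply.
    print(estimate_cost_eur(2_000, 500))
```

Decimal is used instead of float so that very small per-request amounts do not pick up binary rounding noise when summed over many requests.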

About this model

Meta Llama 3.1 8B Instruct is an efficient, multilingual, instruction-tuned model optimized for dialogue and assistant use cases. With 8 billion parameters and a 128K-token context window (131,072 tokens, shown as 131K above), it performs well on general tasks, code generation, and multilingual understanding. It supports function calling and tool use, and its architecture uses Grouped-Query Attention (GQA). It suits deployments with limited compute that still need consistent quality across English and 7 additional languages, including German, French, Spanish, and Hindi.

Technical specifications

Capabilities
Input modalities
Output modalities
Reasoning
No

Knowledge horizon

Knowledge cutoff Dec 2023
Released Jul 2024
Training to release 7 mo
Since release 22 mo
