GPT-OSS 20B
Specifications
- Input
- Output
- Context window: 131K tokens
- Released: Aug 2025
Performance
- Speed: 153 tokens/s
- TTFT (time to first token): 1.1 s
- Latency: 326 ms
- Intelligence: n/a
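As a rough sketch of how these figures combine, the time to receive a complete response can be approximated as the time to first token plus the output length divided by the generation speed. The function name and the 500-token example below are illustrative assumptions, not part of the listing, and real timings vary with load, prompt length, and reasoning effort.

```python
def estimate_response_seconds(output_tokens: int,
                              ttft_s: float = 1.1,
                              tokens_per_s: float = 153.0) -> float:
    """Approximate wall-clock time to stream a full response.

    Defaults come from the figures listed above (TTFT 1.1 s, 153 tokens/s).
    This is a simplification: it ignores network jitter and queueing.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_s + output_tokens / tokens_per_s


# Hypothetical example: a 500-token answer.
print(f"{estimate_response_seconds(500):.2f} s")  # 4.37 s
```

Under this approximation, generation time dominates for long outputs, while TTFT dominates for short ones.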
Pricing
- Input: €0.03 per 1M tokens
- Output: €0.13 per 1M tokens
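To make the per-million-token prices concrete, the sketch below estimates the cost of a single workload. The function name and the example token counts are hypothetical; only the two prices come from the listing above.

```python
# Prices from the listing above, in euros per 1M tokens.
INPUT_EUR_PER_M = 0.03
OUTPUT_EUR_PER_M = 0.13


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in euros for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_EUR_PER_M + output_tokens * OUTPUT_EUR_PER_M
    return total / 1_000_000


# Hypothetical workload: 100,000 input tokens and 20,000 output tokens.
print(f"€{estimate_cost_eur(100_000, 20_000):.4f}")  # €0.0056
```

Output tokens cost roughly four times as much as input tokens here, so long generations (including any billed reasoning tokens, if the provider charges for them) drive most of the cost.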
About this model
GPT-OSS 20B is a compact 21B-parameter Mixture-of-Experts model with 3.6B active parameters, designed for lower latency and local deployment. It runs within 16GB of memory and offers configurable reasoning effort, full chain-of-thought access, and native agentic capabilities, including function calling and structured outputs. It is released under the Apache 2.0 license and is well suited to specialized fine-tuning on consumer hardware. It is the companion model to GPT-OSS 120B, optimized for speed while maintaining strong reasoning capabilities.
Technical specifications
- Capabilities
- Input modalities
- Output modalities
- Reasoning: Default on high
Knowledge horizon
- Knowledge cutoff: Jun 2024
- Released: Aug 2025
- Training to release: 14 months
- Since release: 10 months
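The 14-month training-to-release gap follows directly from the two dates. A minimal sketch of the calculation, using a hypothetical helper that counts whole calendar months:

```python
def months_between(start_year: int, start_month: int,
                   end_year: int, end_month: int) -> int:
    """Number of calendar months from the start month to the end month."""
    return (end_year - start_year) * 12 + (end_month - start_month)


# Knowledge cutoff Jun 2024 -> release Aug 2025.
print(months_between(2024, 6, 2025, 8))  # 14
```

The "since release" figure depends on the date the page was viewed, so it changes over time, whereas the training-to-release gap is fixed.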