GPT-OSS 120B
Specifications
- Input
- Output
- Context window: 131K tokens
- Released: Aug 2025
Performance
- Speed: 158 tokens/s
- TTFT (time to first token): 141 ms
- Latency: n/a
- Intelligence: n/a
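As a rough illustration of what the speed and TTFT figures imply, the sketch below estimates the wall-clock time of a response. It assumes the listed speed is a steady output rate that applies after the first token arrives; the function name and the 500-token example are illustrative, and real timings vary with load and prompt length.

```python
# Figures taken from the Performance list above.
SPEED_TOKENS_PER_S = 158.0   # output throughput
TTFT_S = 0.141               # time to first token, in seconds


def estimated_response_time_s(output_tokens: int) -> float:
    """Assumed simple model: wait for the first token, then stream at a constant rate."""
    return TTFT_S + output_tokens / SPEED_TOKENS_PER_S


if __name__ == "__main__":
    # Hypothetical 500-token answer.
    print(f"{estimated_response_time_s(500):.2f} s")  # prints "3.31 s"
```

Under this model the TTFT term matters mostly for short replies; for long generations the throughput term dominates.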
Pricing
- Input: €0.19 per 1M tokens
- Output: €0.75 per 1M tokens
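To make the per-token pricing concrete, here is a minimal sketch that converts token counts into a cost in euros using the two listed rates. The function name and the example token counts are hypothetical; any provider-specific billing rules (minimums, rounding, cached-token discounts) are not modeled.

```python
# Listed prices, in euros per one million tokens.
INPUT_EUR_PER_M = 0.19
OUTPUT_EUR_PER_M = 0.75


def request_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the listed rates, with no rounding applied."""
    total = input_tokens * INPUT_EUR_PER_M + output_tokens * OUTPUT_EUR_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 2,000 generated tokens.
    print(f"€{request_cost_eur(10_000, 2_000):.4f}")  # prints "€0.0034"
```

Because output tokens cost roughly four times as much as input tokens, long reasoning traces at high effort can dominate the bill even when prompts are large.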
About this model
GPT-OSS 120B is a 117B-parameter Mixture-of-Experts reasoning model with 5.1B active parameters, released under the Apache 2.0 license. It offers configurable reasoning effort (low/medium/high) and full chain-of-thought visibility, and MXFP4 quantization lets it run on a single 80GB GPU. It natively supports function calling, web browsing, Python code execution, and structured outputs. The model is designed for agentic tasks and complex reasoning in production settings, and it can be customized for specialized use cases on a single H100 or MI300X.
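The single-GPU claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes every parameter is stored in MXFP4 at about 4.25 bits (4-bit elements plus one 8-bit scale shared by each block of 32 elements); in practice some weights may be kept at higher precision, and the KV cache and activations need additional memory, so treat the result as a lower bound rather than a measured footprint.

```python
TOTAL_PARAMS = 117e9            # total parameter count from the description
ASSUMED_BITS_PER_PARAM = 4.25   # assumption: 4-bit value + 8-bit scale per 32-element block
GPU_MEMORY_GB = 80              # single 80GB GPU


def weight_footprint_gb(params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    weights_gb = weight_footprint_gb(TOTAL_PARAMS, ASSUMED_BITS_PER_PARAM)
    headroom_gb = GPU_MEMORY_GB - weights_gb
    print(f"weights ~ {weights_gb:.1f} GB, headroom ~ {headroom_gb:.1f} GB")
```

The same weights at 16 bits per parameter would need about 234 GB, which is why the quantization is what makes the single-GPU deployment plausible.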
Technical specifications
- Capabilities
- Input modalities
- Output modalities
- Reasoning: Default on high
Knowledge horizon
- Knowledge cutoff: Jan 2025
- Released: Aug 2025
- Training to release: 7 months
- Since release: 9 months