GPT-OSS 120B
Specifications
- Input
- Output
- Context window
- 131K tokens
- Veröffentlicht
- Aug 2025
Performance
- Speed
- 156 t/s
- TTFT
- 82 ms
- Latency
- 237 ms
- Intelligence
- —
Pricing
- Eingabe
- €0.15 per 1M tokens
- Ausgabe
- €0.60 per 1M tokens
Über dieses Modell
GPT-OSS 120B is a powerful 117B parameter Mixture-of-Experts reasoning model with 5.1B active parameters, released under Apache 2.0. Features configurable reasoning effort (low/medium/high), full chain-of-thought visibility, and runs on a single 80GB GPU thanks to MXFP4 quantization. Native support for function calling, web browsing, Python code execution, and structured outputs. Designed for agentic tasks and complex reasoning with production-grade performance. Fully customizable for specialized use cases on single H100/MI300X.
Technische Daten
- Fähigkeiten
- Eingabe-Modalitäten
- Ausgabe-Modalitäten
- Reasoning
- Standard on high
Knowledge horizon
Wissensstand Jun 2024
Veröffentlicht Aug 2025
Today
Training to release 14 mo Since release 10 mo
See also
Modell zum Vergleich hinzufügen
Nach einem Modell suchen