GPT-OSS 120B

by OpenAI

Specifications

Input
Output
Context window
131K tokens
Released
Aug 2025

Performance

Speed
158 t/s
TTFT
141 ms
Latency
Intelligence
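
As a rough illustration of how the speed and TTFT figures above combine, the sketch below estimates the wall-clock time of a streamed response. It assumes total time is simply time to first token plus output tokens divided by throughput. The function name and the 500-token example are hypothetical, and real latency will vary with load and prompt length.

```python
def estimate_response_seconds(output_tokens: int,
                              tokens_per_second: float = 158.0,
                              ttft_ms: float = 141.0) -> float:
    """Approximate total time: time to first token plus steady-state generation.

    Defaults are the Speed (158 t/s) and TTFT (141 ms) values listed above.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical 500-token completion.
    print(f"{estimate_response_seconds(500):.2f} s")
```

Under these assumptions, generation time dominates for anything beyond a few dozen output tokens, so the TTFT figure matters mostly for short, interactive replies.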

Pricing

Input
€0.19
per 1M tokens
Output
€0.75
per 1M tokens
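
To make the per-million-token prices concrete, here is a minimal sketch of a cost estimate for a single request. The function name and the token counts are hypothetical. The prices are the two listed above, and the calculation assumes simple linear billing with no caching discounts or minimum charges.

```python
from decimal import Decimal

# Prices from the listing above, in euros per 1M tokens.
INPUT_EUR_PER_MILLION = Decimal("0.19")
OUTPUT_EUR_PER_MILLION = Decimal("0.75")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in euros, assuming linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_EUR_PER_MILLION
             + Decimal(output_tokens) * OUTPUT_EUR_PER_MILLION)
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
    print(estimate_cost_eur(2_000, 500))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many calls.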

About this model

GPT-OSS 120B is a 117B-parameter Mixture-of-Experts reasoning model with 5.1B active parameters, released under the Apache 2.0 license. It offers configurable reasoning effort (low, medium, or high) and full chain-of-thought visibility, and MXFP4 quantization lets it run on a single 80 GB GPU. It natively supports function calling, web browsing, Python code execution, and structured outputs. The model is designed for agentic tasks and complex reasoning in production settings, and it can be customized for specialized use cases on a single H100 or MI300X.
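
The single-GPU claim can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes the standard MXFP4 layout (4-bit elements sharing one 8-bit scale per block of 32, so 4.25 bits per parameter). As a simplification, it also assumes all 117B parameters are stored in that format. In practice some layers may be kept at higher precision, and the KV cache and activations need additional memory, so treat the result as a rough estimate of weight storage only. The helper names are hypothetical.

```python
# MXFP4 layout: 4-bit elements with one shared 8-bit scale per 32-element block.
MXFP4_ELEMENT_BITS = 4
MXFP4_SCALE_BITS = 8
MXFP4_BLOCK_SIZE = 32


def mxfp4_bits_per_param() -> float:
    """Effective bits per parameter, including the amortized block scale."""
    return MXFP4_ELEMENT_BITS + MXFP4_SCALE_BITS / MXFP4_BLOCK_SIZE


def weight_footprint_gb(total_params: float, bits_per_param: float) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes); excludes KV cache."""
    return total_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    total_params = 117e9  # total parameter count stated above
    print(f"MXFP4: {weight_footprint_gb(total_params, mxfp4_bits_per_param()):.1f} GB")
    print(f"BF16:  {weight_footprint_gb(total_params, 16):.1f} GB")
```

Under these assumptions the weights come to roughly 62 GB in MXFP4 versus 234 GB in 16-bit precision, which is consistent with the model fitting on one 80 GB accelerator only when quantized.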

Technical specifications

Capabilities
Input modalities
Output modalities
Reasoning
Default on high

Knowledge horizon

Knowledge cutoff Jan 2025
Released Aug 2025
Today
Training to release 7 mo
Since release 9 mo
