Nemotron 3 Super 120B A12B FP8

von NVIDIA

Specifications

Input
Output
Context window: 262K tokens
Veröffentlicht: Mar 2026

Performance

Speed: 2 t/s
TTFT: 8.0s
Latency: 10.6s
Intelligence: —

Pricing

Eingabe: €0.30
Ausgabe: €0.90

Über dieses Modell

NVIDIA Nemotron 3 Super 120B A12B FP8 is a 120B parameter (12B active) LatentMixture-of-Experts hybrid model with Mamba-2, MoE and Multi-Token Prediction layers, supporting up to 1M tokens context. It achieves 94.73% on HMMT Feb25 (with tools) and 83.73% on MMLU‑Pro, and scores 73.88% on Arena‑Hard‑V2 (Hard Prompt). The model supports configurable reasoning via an enable_thinking flag, tool use, and structured output. It is available under the NVIDIA Nemotron Open Model License.

Technische Daten

Fähigkeiten
Eingabe-Modalitäten
Ausgabe-Modalitäten
Reasoning: Hybrid Standard off

Knowledge horizon

Wissensstand Feb 2026

Veröffentlicht Mar 2026

Today

Training to release 1 mo Since release 4 mo

Nemotron 3 Super 120B A12B FP8

Specifications

Performance

Pricing

Über dieses Modell

Technische Daten

Knowledge horizon

See also

Kimi K2.6

Qwen 3.6 27B

Gemma 4 31B