Nemotron 3 Super 120B A12B FP8
Specifications
- Input
- Output
- Context window
- 262K tokens
- Veröffentlicht
- Mar 2026
Performance
- Speed
- 2 t/s
- TTFT
- 221 ms
- Latency
- 199 ms
- Intelligence
- —
Pricing
- Eingabe
- €0.30 per 1M tokens
- Ausgabe
- €0.90 per 1M tokens
Über dieses Modell
NVIDIA Nemotron 3 Super 120B A12B FP8 is a 120B parameter (12B active) LatentMixture-of-Experts hybrid model with Mamba-2, MoE and Multi-Token Prediction layers, supporting up to 1M tokens context. It achieves 94.73% on HMMT Feb25 (with tools) and 83.73% on MMLU‑Pro, and scores 73.88% on Arena‑Hard‑V2 (Hard Prompt). The model supports configurable reasoning via an enable_thinking flag, tool use, and structured output. It is available under the NVIDIA Nemotron Open Model License.
Technische Daten
- Fähigkeiten
- Eingabe-Modalitäten
- Ausgabe-Modalitäten
- Reasoning
- Hybrid Standard off
Knowledge horizon
Wissensstand Feb 2026
Veröffentlicht Mar 2026
Today
Training to release 1 mo Since release 3 mo
See also
Modell zum Vergleich hinzufügen
Nach einem Modell suchen