
Hermes 4 70B

by NousResearch

Specifications

Input
Output
Context window: 128K tokens
Released: Aug 2025

Performance

Speed: 86 t/s
TTFT (time to first token): 123 ms
Latency: 398 ms
Intelligence
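
As a rough illustration of how the speed and TTFT figures combine, the sketch below estimates the wall-clock time for a streamed response. It assumes that the listed speed means output tokens per second once generation has started, and that both figures are averages; the function name is hypothetical, and the separate "Latency" figure is left out because the page does not define it.

```python
# Figures copied from the Performance section of this page.
SPEED_TOKENS_PER_S = 86.0   # listed as "86 t/s" (assumed: output tokens per second)
TTFT_S = 0.123              # listed as "123 ms" time to first token


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate time to receive a full response (hypothetical helper).

    Model: wait for the first token, then stream the rest at a constant rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_S + output_tokens / SPEED_TOKENS_PER_S


if __name__ == "__main__":
    print(f"{estimate_response_seconds(500):.1f} s for 500 output tokens")
```

Under these assumptions a 500-token reply takes roughly 5.9 seconds; real timings will vary with load and prompt length.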

Pricing

Input: €0.12 per 1M tokens
Output: €0.40 per 1M tokens
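
The per-million-token prices above translate into a per-request cost as sketched below. This is a minimal example, assuming the listed prices apply linearly with no minimum charge or other fees; the function name is hypothetical.

```python
# Prices copied from the Pricing section of this page (EUR per 1M tokens).
INPUT_EUR_PER_MTOK = 0.12
OUTPUT_EUR_PER_MTOK = 0.40


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in EUR (hypothetical helper).

    Assumes simple linear pricing: tokens / 1,000,000 * price per 1M tokens.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens * INPUT_EUR_PER_MTOK
        + output_tokens * OUTPUT_EUR_PER_MTOK
    ) / 1_000_000


if __name__ == "__main__":
    cost = estimate_cost_eur(input_tokens=10_000, output_tokens=2_000)
    print(f"Estimated cost: EUR {cost:.4f}")
```

For example, a request with 10,000 input tokens and 2,000 output tokens comes to about €0.002 under these assumptions.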

About this model

NousResearch Hermes 4 70B is a frontier hybrid-mode reasoning model based on Llama-3.1-70B, trained on ~60B tokens across ~5M samples. It features explicit <think> deliberation segments and is described as delivering large improvements in math, code, STEM, logic, and creativity. It achieves state-of-the-art (SOTA) results on RefusalBench, giving helpful, uncensored responses while maintaining alignment to user values. It supports schema-adherent structured JSON outputs, function calling, and tool use. It was trained for extreme steerability, with lower refusal rates than previous Hermes versions.
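
When reasoning is enabled, a client usually wants to separate the deliberation from the final answer. The sketch below shows one way to do that. It assumes each segment is closed with a matching </think> tag and appears inline in the returned text; the page only mentions <think> segments, so the closing tag and the helper name are assumptions, and some providers may return the reasoning in a separate field instead.

```python
import re

# Assumption: deliberation is wrapped as <think> ... </think> in the raw text.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[list[str], str]:
    """Return (reasoning segments, final answer) from raw model output.

    Hypothetical helper: collects every <think> block, then removes those
    blocks from the text to leave only the user-facing answer.
    """
    reasoning = [segment.strip() for segment in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<think>2 + 2 is 4.</think>The answer is 4."
    thoughts, final = split_reasoning(raw)
    print(thoughts)
    print(final)
```

Text with no <think> block passes through unchanged as the answer, which matches the default-off reasoning mode listed for this model.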

Technical specifications

Capabilities
Input modalities
Output modalities
Reasoning: Hybrid (default off)

Knowledge horizon

Released Aug 2025
Today
Since release 10 mo
