Hermes 4 70B

by NousResearch

Specifications

Input
Output
Context window: 128K tokens
Released: Aug 2025

Performance

Speed: 86 t/s
TTFT: 187 ms
Latency: 119 ms
Intelligence: —
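
The speed and TTFT figures above can be combined into a rough end-to-end estimate for a streamed reply. The sketch below is a simplification: it assumes the first token arrives after the listed TTFT and the rest stream at the listed rate. The function name and the steady-rate model are illustrative and do not come from the listing.

```python
def estimate_response_seconds(
    output_tokens: int,
    tokens_per_second: float = 86.0,  # "Speed" value from the listing
    ttft_ms: float = 187.0,           # "TTFT" value from the listing
) -> float:
    """Approximate wall-clock time to stream a full reply.

    Assumption (not from the listing): generation proceeds at a constant
    rate once the first token has arrived.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


if __name__ == "__main__":
    # A reply of 860 tokens at 86 t/s takes about 10 s plus the TTFT.
    print(f"{estimate_response_seconds(860):.3f} s")
```

Real throughput varies with load and prompt length, so treat the result as an order-of-magnitude guide.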

Pricing

Input: €0.12
Output: €0.40
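
For budgeting, the two prices can be turned into a per-request estimate. The listing does not state the billing unit, so the sketch below takes it as a parameter; the default of one million tokens per price unit is an assumption, and the function name is hypothetical.

```python
from decimal import Decimal


def estimate_cost_eur(
    input_tokens: int,
    output_tokens: int,
    input_price: Decimal = Decimal("0.12"),   # listed input price
    output_price: Decimal = Decimal("0.40"),  # listed output price
    tokens_per_price_unit: int = 1_000_000,   # ASSUMPTION: unit not stated in the listing
) -> Decimal:
    """Estimate the cost of one request in euros."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / Decimal(tokens_per_price_unit)


if __name__ == "__main__":
    # 20,000 prompt tokens and 2,000 completion tokens under the assumed unit.
    print(estimate_cost_eur(20_000, 2_000))
```

Decimal is used instead of float so that small per-request amounts do not accumulate rounding error when summed.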

About this model

Hermes 4 70B from NousResearch is a hybrid-mode reasoning model based on Llama-3.1-70B and trained on ~60B tokens across ~5M samples. When reasoning is enabled, it emits explicit <think> deliberation segments before answering, and it shows large improvements in math, code, STEM, logic, and creativity. It achieves state-of-the-art results on RefusalBench, giving helpful, uncensored responses while staying aligned to user values. It supports schema-adherent structured JSON output, function calling, and tool use. It was trained for high steerability and has lower refusal rates than previous Hermes versions.
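
Because the model can emit <think> deliberation segments, client code often needs to separate the reasoning from the final answer. The following is a minimal sketch that assumes each segment is closed with a matching </think> tag and appears inline in the returned text. The closing tag, the helper name, and the sample string are illustrative assumptions, not taken from the listing.

```python
import re
from typing import NamedTuple


class ParsedReply(NamedTuple):
    reasoning: str  # concatenated contents of all <think> segments
    answer: str     # remaining text with the segments removed


# ASSUMPTION: segments are delimited as <think>...</think>.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> ParsedReply:
    """Split raw model output into deliberation text and the final answer."""
    segments = [segment.strip() for segment in _THINK_RE.findall(text)]
    answer = _THINK_RE.sub("", text).strip()
    return ParsedReply(reasoning="\n".join(segments), answer=answer)


if __name__ == "__main__":
    raw = "<think>2 + 2 is 4.</think>\nThe answer is 4."
    parsed = split_reasoning(raw)
    print(parsed.reasoning)
    print(parsed.answer)
```

When reasoning is off, the text contains no segments and the helper returns an empty reasoning string with the answer unchanged.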

Technical specifications

Capabilities
Input modalities
Output modalities
Reasoning: Hybrid (default off)

See also

Kimi K2.6

Qwen 3.6 27B

Gemma 4 31B