Command Palette
Search for a command to run

Hermes 4 405B

by NousResearch

Specifications

Input
Output
Context window
128K tokens
Released
Aug 2025

Performance

Speed
41 t/s
TTFT
224 ms
Latency
228 ms
Intelligence

Pricing

Input
€0.95
per 1M tokens
Output
€2.85
per 1M tokens

About this model

NousResearch Hermes 4 405B is the flagship hybrid-mode reasoning model based on Meta's Llama-3.1-405B architecture. Trained on a massive ~60B token corpus with explicit <think> deliberation segments, it delivers frontier-level performance in math, code, STEM, logic, and creative tasks. Achieves SOTA on RefusalBench for helpful, uncensored responses aligned to user values. Supports advanced function calling, structured JSON outputs, and tool use with extreme steerability and reduced refusal rates.

Technical specifications

Capabilities
Input modalities
Output modalities
Reasoning
Hybrid Default off

Knowledge horizon

Released Aug 2025
Today
Since release 10 mo

See also

Add Model to Comparison
Search for a model to add
Command Palette
Search for a command to run