Hermes 4 70B
Specifications
- Input
- Output
- Context window: 128K tokens
- Released: Aug 2025
Performance
- Speed: 86 tokens/s
- TTFT (time to first token): 123 ms
- Latency: 398 ms
- Intelligence: n/a
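As a rough illustration of how the speed and TTFT figures combine, the sketch below estimates wall-clock time for a streamed response. It assumes the listed speed is the steady-state output rate after the first token and that the figures hold for your request; the page does not define how "Latency" is measured, so that figure is not used. `estimate_response_seconds` is a hypothetical helper name.

```python
# Figures copied from the Performance list; real values vary by provider and load.
TTFT_SECONDS = 0.123        # time to first token
TOKENS_PER_SECOND = 86.0    # assumed steady-state output rate


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough estimate: time to first token plus generation time at the listed rate."""
    if output_tokens <= 0:
        return 0.0
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # A 500-token answer, under the assumptions above.
    print(f"{estimate_response_seconds(500):.2f} s")
```

With reasoning enabled, the <think> tokens count toward the output total, so the visible answer may start later than this estimate suggests.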
Pricing
- Input: €0.12 per 1M tokens
- Output: €0.40 per 1M tokens
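The per-token prices translate into a request cost as sketched below. This is a minimal example using the two listed rates; `estimate_cost_eur` is a hypothetical helper, and it assumes billing is purely linear in token counts with no minimums, caching discounts, or taxes.

```python
# Prices copied from the Pricing list (EUR per 1M tokens).
INPUT_EUR_PER_MILLION = 0.12
OUTPUT_EUR_PER_MILLION = 0.40


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in euros, assuming linear per-token billing."""
    total = (input_tokens * INPUT_EUR_PER_MILLION
             + output_tokens * OUTPUT_EUR_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens.
    print(f"EUR {estimate_cost_eur(10_000, 2_000):.4f}")
```

Output tokens cost more than three times as much as input tokens here, so long generated answers, including any reasoning tokens if those are billed as output, tend to dominate the bill.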
About this model
NousResearch Hermes 4 70B is a hybrid-mode reasoning model based on Llama-3.1-70B, trained on roughly 60B tokens across about 5M samples. It can produce explicit <think> deliberation segments before answering, and it is described as substantially improved in math, code, STEM, logic, and creative tasks. It is reported to achieve state-of-the-art results on RefusalBench, giving helpful, uncensored responses while maintaining alignment to user values. It supports schema-adherent structured JSON output, function calling, and tool use. It was trained for high steerability and has lower refusal rates than previous Hermes versions.
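When reasoning is enabled, a client usually wants to separate the deliberation from the final answer. The sketch below assumes the raw response text wraps each reasoning segment in a `<think>` tag closed by `</think>`; the closing tag and the helper name `split_reasoning` are assumptions, not something this page specifies.

```python
import re

# Assumed format: reasoning appears as <think> ... </think> in the raw text.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[list[str], str]:
    """Return (reasoning segments, answer text with those segments removed)."""
    segments = [segment.strip() for segment in THINK_PATTERN.findall(text)]
    answer = THINK_PATTERN.sub("", text).strip()
    return segments, answer


if __name__ == "__main__":
    raw = "<think>2 + 2 = 4</think>\nThe answer is 4."
    reasoning, answer = split_reasoning(raw)
    print(reasoning)
    print(answer)
```

An unclosed `<think>` tag, for example from a truncated response, is left in the answer text untouched, so callers may want to check for it separately.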
Technical specifications
- Capabilities
- Input modalities
- Output modalities
- Reasoning: hybrid, off by default
Knowledge horizon
Released Aug 2025