GLM-4.6
Specifications
- Input: Text
- Output: Text
- Context window: 203K tokens
- Released: Sep 2025
Performance
- Speed: 21 t/s
- TTFT: 929 ms
- Latency: not reported
- Intelligence: not reported
Pricing
- Input: €0.46 per 1M tokens
- Output: €2.02 per 1M tokens
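Given the per-1M-token rates above, the cost of a single request can be estimated with a small helper. This is a minimal sketch; the function name and the example token counts are illustrative, not part of any official API.

```python
def estimate_cost_eur(input_tokens: int, output_tokens: int,
                      input_rate: float = 0.46,
                      output_rate: float = 2.02) -> float:
    """Estimate request cost in EUR from token counts and per-1M-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: a 100K-token prompt with a 2K-token completion
cost = estimate_cost_eur(100_000, 2_000)
print(f"€{cost:.4f}")  # prints "€0.0500"
```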
About this model
GLM-4.6 is a frontier-scale 355B-parameter Mixture-of-Experts model with a 200K-token context window and 128K-token output capability. It is MIT licensed, making it the only model in its class that enterprises can self-host and deeply customize. It ranks #1 on LiveCodeBench v6 (82.8%) and HLE, and #3 on both AIME 2025 (93.9%) and Terminal-Bench (40.5%). It reaches near parity with Claude Sonnet 4 (48.6% win rate) while dramatically outperforming other open-source baselines. Purpose-built for agentic workflows, real-world coding, and tool-augmented problem-solving, it supports native tool calling during inference for complex multi-step tasks.
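The native tool calling mentioned above is typically exercised through an OpenAI-compatible chat-completions request. The sketch below only builds the request payload; the `get_weather` tool, its parameters, and the prompt are hypothetical examples, and the actual endpoint, authentication, and response handling depend on the provider you use to serve GLM-4.6.

```python
import json

# Hypothetical tool schema for illustration; not part of any official GLM-4.6 API.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def build_tool_call_request(prompt: str) -> dict:
    """Build a chat-completions payload offering the model one callable tool."""
    return {
        "model": "glm-4.6",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

payload = build_tool_call_request("What's the weather in Berlin?")
print(json.dumps(payload, indent=2))
```

When the model decides a tool is needed, the response carries a `tool_calls` entry whose arguments your code executes before sending the result back in a follow-up message.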
Technical specifications
Capabilities
- Input modalities: Text
- Output modalities: Text
- Reasoning: Hybrid (default off)
Knowledge horizon
- Knowledge cutoff: Aug 2025
- Released: Sep 2025
- Training to release: 1 mo
- Since release: 8 mo