GLM-4.6

by ZAI

Specifications

Input
Output
Context window
203K tokens
Released
Sep 2025

Performance

Speed
21 t/s
TTFT
929 ms
Latency
Intelligence
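
As a rough illustration of what the listed speed and TTFT figures imply, the sketch below estimates wall-clock time for a response. It assumes total time is approximately TTFT plus output tokens divided by output speed, which ignores queueing, network overhead, and any extra tokens produced when reasoning is enabled. The function name `estimate_response_seconds` is hypothetical and not part of any API.

```python
# Figures taken from the Performance listing above.
TTFT_SECONDS = 0.929       # time to first token (929 ms)
TOKENS_PER_SECOND = 21.0   # output speed (21 t/s)


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate response time, assuming TTFT + tokens / speed.

    This simple model is an assumption for illustration; real latency
    varies with load, prompt length, and provider.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical response lengths, in tokens.
    for n in (100, 500, 2000):
        print(f"{n} tokens: about {estimate_response_seconds(n):.1f} s")
```

At 21 t/s, generation time dominates TTFT for anything beyond a few dozen tokens, so long outputs are where the listed speed matters most.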

Pricing

Input
€0.46
per 1M tokens
Output
€2.02
per 1M tokens
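
To make the per-token rates concrete, here is a minimal sketch of a cost estimate at the listed prices. It assumes billing is strictly linear in token counts, with no caching discounts, minimum charges, or provider-specific fees. The function name `estimate_cost_eur` and the example token counts are hypothetical.

```python
from decimal import Decimal

# Listed prices in euros per 1M tokens (from the Pricing listing above).
INPUT_EUR_PER_M = Decimal("0.46")
OUTPUT_EUR_PER_M = Decimal("2.02")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated request cost in euros, assuming linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_EUR_PER_M / TOKENS_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_EUR_PER_M / TOKENS_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 50,000 input tokens and 2,000 output tokens.
    cost = estimate_cost_eur(50_000, 2_000)
    print(f"Estimated cost: EUR {cost:.5f}")
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many requests.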

About this model

GLM-4.6 is a frontier-scale Mixture-of-Experts model with 355B parameters, a 200K-token context window, and up to 128K tokens of output. It is MIT licensed, making it the only model in its class that enterprises can self-host and deeply customize. It ranks #1 on LiveCodeBench v6 (82.8%) and on HLE, and #3 on both AIME 2025 (93.9%) and Terminal-Bench (40.5%). It reaches near parity with Claude Sonnet 4 (48.6% win rate) while dramatically outperforming other open-source baselines. The model is purpose-built for agentic workflows, real-world coding, and tool-augmented problem-solving, and it supports native tool calling during inference for complex multi-step tasks.

Technical specifications

Capabilities
Input modalities
Output modalities
Reasoning
Hybrid (default off)

Knowledge horizon

Knowledge cutoff Aug 2025
Released Sep 2025
Training to release 1 mo
Since release 8 mo
