DeepSeek V4 Flash

by DeepSeek

Specifications

Context window: 1M tokens
Released: Apr 2026

Performance

Speed: 27 t/s
TTFT: 3.2s
Latency: —
Intelligence: —
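
As a rough illustration of how the speed and TTFT figures combine, the sketch below estimates the wall-clock time for a streamed response. It assumes generation proceeds at a constant 27 t/s once the first token arrives after 3.2 s; real throughput varies with load and prompt length, and the function name `estimated_response_seconds` is hypothetical, not part of any DeepSeek API.

```python
def estimated_response_seconds(
    output_tokens: int,
    ttft_s: float = 3.2,          # time to first token, from the listing above
    tokens_per_s: float = 27.0,   # output speed, from the listing above
) -> float:
    """Estimate total response time as TTFT plus steady-state generation time.

    Assumption: throughput is constant after the first token.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # Estimate for a 500-token answer under the stated assumptions.
    print(f"{estimated_response_seconds(500):.1f} s")
```

Under these assumptions, long answers are dominated by the generation term and short answers by TTFT.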

Pricing

Input: €0.18
Output: €0.35
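
The listing does not state the billing unit for these prices. The sketch below assumes they are quoted per 1 million tokens, which is a common convention but is not confirmed here; adjust `ASSUMED_TOKENS_PER_UNIT` if the provider bills differently. The function name `estimate_cost_eur` is hypothetical.

```python
from decimal import Decimal

INPUT_EUR = Decimal("0.18")   # input price from the listing above
OUTPUT_EUR = Decimal("0.35")  # output price from the listing above

# Assumption (not stated in the listing): prices are per 1M tokens.
ASSUMED_TOKENS_PER_UNIT = 1_000_000


def estimate_cost_eur(
    input_tokens: int,
    output_tokens: int,
    tokens_per_unit: int = ASSUMED_TOKENS_PER_UNIT,
) -> Decimal:
    """Estimate the cost of one request in euros under the assumed billing unit."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = INPUT_EUR * input_tokens + OUTPUT_EUR * output_tokens
    return total / tokens_per_unit


if __name__ == "__main__":
    # A request with 200k input tokens and 2k output tokens.
    print(estimate_cost_eur(200_000, 2_000))
```

Decimal arithmetic is used instead of floats so that small per-request amounts do not accumulate rounding error when summed.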

About this model

DeepSeek V4 Flash is a 284B-parameter Mixture-of-Experts (MoE) chat model from DeepSeek AI. It uses a hybrid attention architecture that combines compressed sparse attention with heavily compressed attention, and it supports a 1-million-token context window. The model scores 88.7% EM on MMLU (general knowledge), 69.5% Pass@1 on HumanEval (coding), and 44.7% EM on LongBench-V2 (long-context understanding). It is released under the MIT License.

Technical specifications

Capabilities
Input modalities
Output modalities
Reasoning: Hybrid (on by default)

Knowledge horizon

Released Apr 2026


See also

Kimi K2.5

MiniMax M2.5

DeepSeek R1 0528