Z AI
GLM 4.6
Compared with GLM-4.5, this generation brings several key improvements:
- Longer context window: expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks.
- Superior coding performance: higher scores on code benchmarks and better real-world performance in applications such as Claude Code, Cline, Roo Code, and Kilo Code, including improvements in generating visually polished front-end pages.
- Advanced reasoning: a clear improvement in reasoning performance, with support for tool use during inference, leading to stronger overall capability.
- More capable agents: stronger performance in tool use and search-based agents, and more effective integration within agent frameworks.
- Refined writing: better alignment with human preferences in style and readability, and more natural performance in role-playing scenarios.
- Input / 1M tokens
- $0.390
- Output / 1M tokens
- $1.90
- Context window
- 200K tokens
- Provider
- Z AI
- Knowledge cutoff
- 2025-03-31
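As a quick illustration of the listed rates, per-request cost is just a weighted sum of input and output tokens (a minimal sketch; the token counts in the example are hypothetical):

```python
# Listed GLM-4.6 prices (USD per 1M tokens)
INPUT_PRICE = 0.390
OUTPUT_PRICE = 1.90

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

# Example: a 50K-token prompt with a 2K-token completion
print(f"${request_cost(50_000, 2_000):.4f}")  # → $0.0233
```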
Performance
Median streaming throughput and first-token latency measured by Artificial Analysis.
- Output tokens / sec
- 67 t/s
- Time to first token
- 0.95s
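Taken together, the two medians above give a rough end-to-end latency estimate: time to first token plus output length divided by streaming throughput (a minimal sketch; the 1,000-token response length is hypothetical):

```python
TTFT_S = 0.95        # median time to first token (seconds)
THROUGHPUT_TPS = 67  # median streaming throughput (output tokens/sec)

def response_time(output_tokens: int) -> float:
    """Rough end-to-end time: first-token latency + streaming time."""
    return TTFT_S + output_tokens / THROUGHPUT_TPS

# Example: a 1,000-token response
print(f"{response_time(1000):.1f}s")  # → 15.9s
```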
Benchmarks
Intelligence, coding, and math indexes plus the underlying evaluation scores.
- Intelligence Index
- 30
- Coding Index
- 30
- Math Index
- 44
- MMLU-Pro
- 78.4%
- GPQA
- 63.2%
- HLE
- 5.2%
- LiveCodeBench
- 56.1%
- SciCode
- 33.1%
- MATH-500
- —
- AIME
- —
Benchmarks via Artificial Analysis