Z AI
GLM 4.6
Compared with GLM-4.5, this generation brings several key improvements:
- Longer context window: expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks.
- Superior coding performance: higher scores on code benchmarks and better real-world performance in applications such as Claude Code, Cline, Roo Code, and Kilo Code, including improvements in generating visually polished front-end pages.
- Advanced reasoning: a clear improvement in reasoning performance, with support for tool use during inference, leading to stronger overall capability.
- More capable agents: stronger performance in tool use and search-based agents, and more effective integration within agent frameworks.
- Refined writing: better alignment with human preferences in style and readability, and more natural performance in role-playing scenarios.
- Input / 1M tokens
- $0.390
- Output / 1M tokens
- $1.90
- Context window
- 200K tokens
- Provider
- Z AI
- Knowledge cutoff
- 2025-03-31
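As a quick illustration of the listed rates, per-request cost is just a weighted sum of input and output tokens (a minimal sketch; the token counts in the example are hypothetical):

```python
# Listed GLM-4.6 prices (USD per 1M tokens)
INPUT_PRICE = 0.390
OUTPUT_PRICE = 1.90

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

# Example: a 50K-token prompt with a 2K-token completion
print(f"${request_cost(50_000, 2_000):.4f}")  # → $0.0233
```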
Performance
Median streaming throughput and first-token latency measured by Artificial Analysis.
- Output tokens / sec
- 67 t/s
- Time to first token
- 0.95s
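Taken together, the two medians above give a rough end-to-end latency estimate: time to first token plus output length divided by streaming throughput (a minimal sketch; the 1,000-token response length is hypothetical):

```python
TTFT_S = 0.95        # median time to first token (seconds)
THROUGHPUT_TPS = 67  # median streaming throughput (output tokens/sec)

def response_time(output_tokens: int) -> float:
    """Rough end-to-end time: first-token latency + streaming time."""
    return TTFT_S + output_tokens / THROUGHPUT_TPS

# Example: a 1,000-token response
print(f"{response_time(1000):.1f}s")  # → 15.9s
```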
Benchmarks
Intelligence, coding, and math indexes plus the underlying evaluation scores.
- Intelligence Index
- 30
- Coding Index
- 30
- Math Index
- 44
- MMLU-Pro
- 78.4%
- GPQA
- 63.2%
- HLE
- 5.2%
- LiveCodeBench
- 56.1%
- SciCode
- 33.1%
- MATH-500
- —
- AIME
- —
Benchmarks via Artificial Analysis