Minimax

MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning, tool use, and multi-step task execution while maintaining low latency and deployment efficiency. The model excels in code generation, multi-file editing, compile-run-fix loops, and test-validated repair, showing strong results on SWE-Bench Verified, Multi-SWE-Bench, and Terminal-Bench. It also performs competitively in agentic evaluations such as BrowseComp and GAIA, effectively handling long-horizon planning, retrieval, and recovery from execution errors. Benchmarked by [Artificial Analysis](https://artificialanalysis.ai/models/minimax-m2), MiniMax-M2 ranks among the top open-source models for composite intelligence, spanning mathematics, science, and instruction-following. Its small activation footprint enables fast inference, high concurrency, and improved unit economics, making it well-suited for large-scale agents, developer assistants, and reasoning-driven applications that require responsiveness and cost efficiency. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in our [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).

Input / 1M tokens: $0.255
Output / 1M tokens: $1.00
Context window: 197K tokens
Provider: Minimax
Cached input / 1M: $0.030

Performance

Median streaming throughput and first-token latency measured by Artificial Analysis.

Output tokens / sec: 64 t/s
Time to first token: 2.25s

Benchmarks

Intelligence, coding, and math indexes plus the underlying evaluation scores.

Intelligence Index: 36
Coding Index: 29
Math Index: 78
MMLU-Pro: 82.0%
GPQA: 77.7%
HLE: 12.5%
LiveCodeBench: 82.6%
SciCode: 36.1%
MATH-500: —
AIME: —

Benchmarks via Artificial Analysis