Qwen

Qwen3 30B A3B Thinking 2507

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model is designed specifically for “thinking mode,” where internal reasoning traces are separated from final answers. Compared to earlier Qwen3-30B releases, this version improves performance across logical reasoning, mathematics, science, coding, and multilingual benchmarks. It also demonstrates stronger instruction following, tool use, and alignment with human preferences. With higher reasoning efficiency and extended output budgets, it is best suited for advanced research, competitive problem solving, and agentic applications requiring structured long-context reasoning.

Input / 1M tokens: $0.080
Output / 1M tokens: $0.400
Context window: 131K tokens
Provider: Qwen
Cached input / 1M: $0.080
Knowledge cutoff: 2025-06-30

Performance

Median streaming throughput and first-token latency measured by Artificial Analysis.

Output tokens / sec: —
Time to first token: —