OpenAI
GPT-4.1 Nano
GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series, aimed at tasks that demand low latency. Despite its small size, it offers a 1 million token context window and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding, even higher than GPT‑4o mini. It is well suited to tasks like classification or autocompletion.
- Input / 1M tokens: $0.100
- Output / 1M tokens: $0.400
- Context window: 1.0M tokens
- Provider: OpenAI
- Cached input / 1M tokens: $0.025
- Knowledge cutoff: 2024-06-30
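As a rough illustration of how the per-token prices listed above translate into the cost of a single request, here is a minimal Python sketch. The function name `estimate_cost` and its parameters are hypothetical, and the sketch assumes that cached input tokens are a subset of the input tokens and are billed at the cached rate instead of the regular input rate; actual billing may differ.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the listing above.
PRICE_PER_MILLION = {
    "input": Decimal("0.100"),
    "cached_input": Decimal("0.025"),
    "output": Decimal("0.400"),
}


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_input_tokens counts tokens that are part of
    input_tokens and are charged at the cached rate rather than the
    regular input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed input tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    total_per_million = (
        uncached_tokens * PRICE_PER_MILLION["input"]
        + cached_input_tokens * PRICE_PER_MILLION["cached_input"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    # Prices are quoted per 1M tokens, so scale down at the end.
    return total_per_million / Decimal(1_000_000)


if __name__ == "__main__":
    # 10,000 uncached input tokens and 500 output tokens.
    print(estimate_cost(10_000, 500))
```

Under these assumptions, a request with 10,000 input tokens and 500 output tokens costs about $0.0012: $0.001 for input plus $0.0002 for output.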
Performance
Median streaming throughput and first-token latency measured by Artificial Analysis.
- Output tokens / sec: not available
- Time to first token: not available