MiniMax M3
Budget
by MiniMax
MiniMax M3 launched June 1, 2026 as an open-weight coding and agentic model built on MiniMax Sparse Attention, which replaces full attention with KV-block selection and cuts long-context compute at 1M tokens to roughly one twentieth of the previous generation's. It accepts text, image, and video input, with a 1,048,576-token context window and up to 512K output tokens. MiniMax reports 59% on SWE-Bench Pro and 83.5 on BrowseComp, though several results were run on MiniMax infrastructure with agent scaffolding, so independent verification is still pending. API pricing is $0.30 per million input tokens and $1.20 per million output tokens. Weights and a technical report are due on Hugging Face within about ten days of launch.
Input Price
$0.30
per 1M tokens
Output Price
$1.20
per 1M tokens
Context Window
1.0M
tokens
Released
2026-06
Open source
Capabilities
Key Strengths
- ✓ 1M token context window
- ✓ Very low pricing ($0.30/$1.20)
- ✓ Sparse attention cuts long-context cost roughly 20x
- ✓ Multimodal input (text, image, video)
- ✓ Open weights promised within days of launch
Best For
- ▸ Budget agentic coding
- ▸ Long-context repository analysis
- ▸ Browser and tool-use agents
- ▸ Self-hosted inference once weights land
Pricing Details
Input tokens
$0.30
per 1M tokens
Output tokens
$1.20
per 1M tokens
Estimated cost per 1K requests
$0.90
~1K input + ~500 output tokens avg
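The estimate above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name estimate_cost and its parameters are hypothetical, not part of any MiniMax SDK, and the prices are the list prices quoted on this page, which may change.

```python
# Hypothetical helper for illustration; not part of any MiniMax SDK.
# Prices are the list prices quoted on this page (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.30
OUTPUT_PRICE_PER_M = 1.20


def estimate_cost(requests, input_tokens, output_tokens,
                  input_price=INPUT_PRICE_PER_M,
                  output_price=OUTPUT_PRICE_PER_M):
    """Return the estimated USD cost for a batch of same-sized requests."""
    input_cost = requests * input_tokens / 1_000_000 * input_price
    output_cost = requests * output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# 1K requests averaging ~1K input and ~500 output tokens each.
cost = estimate_cost(requests=1_000, input_tokens=1_000, output_tokens=500)
print(f"${cost:.2f}")
```

With these inputs, the input side contributes $0.30 (1M tokens) and the output side $0.60 (0.5M tokens), which matches the $0.90 figure. Passing different token averages gives a rough estimate for other workloads.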
Prices are subject to change. Check the official documentation for current pricing. See the cost calculator for detailed estimates.
Open Source Model
MiniMax M3 will be free to download and self-host once its open weights are released; the license is still to be determined. Hosted API pricing varies by provider (e.g., Together, Fireworks, Groq). See our open source LLM guide for deployment options.