MiniMax M3

Budget

MiniMax M3 launched on June 1, 2026 as an open-weight coding and agentic model built on MiniMax Sparse Attention, which replaces full attention with KV-block selection and cuts long-context compute at 1M tokens to roughly one twentieth of the previous generation's. It accepts text, image, and video input, with a 1,048,576-token context window and up to 512K output tokens. MiniMax reports 59% on SWE-Bench Pro and 83.5 on BrowseComp, but several of these results were run on MiniMax infrastructure with agent scaffolding, so independent verification is still pending. API pricing is $0.30 per million input tokens and $1.20 per million output tokens. Weights and a technical report are due on Hugging Face within about ten days of launch.

Input Price

$0.30

per 1M tokens

Output Price

$1.20

per 1M tokens

Context Window

1.0M

tokens

Released

2026-06

Open source

Capabilities

text, vision, video, code, tool-use

Key Strengths

✓ 1M token context window
✓ Very low pricing ($0.30/$1.20)
✓ Sparse attention cuts long-context cost roughly 20x
✓ Multimodal input (text, image, video)
✓ Open weights promised within days of launch

Best For

▸ Budget agentic coding
▸ Long-context repository analysis
▸ Browser and tool-use agents
▸ Self-hosted inference once weights land

Pricing Details

Input tokens

$0.30

per 1M tokens

Output tokens

$1.20

per 1M tokens

Estimated cost per 1K requests

$0.90

~1K input + ~500 output tokens avg
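
The estimate above follows directly from the listed per-token prices. As a minimal sketch, the arithmetic can be reproduced in Python; the function name `estimate_cost` and its parameters are illustrative and not part of any MiniMax SDK, and the prices are the ones quoted on this page, which may change.

```python
# Listed prices in USD per 1M tokens (taken from this page; subject to change).
INPUT_PRICE_PER_M = 0.30
OUTPUT_PRICE_PER_M = 1.20


def estimate_cost(requests, input_tokens, output_tokens,
                  input_price=INPUT_PRICE_PER_M,
                  output_price=OUTPUT_PRICE_PER_M):
    """Return the estimated USD cost for a batch of requests.

    input_tokens and output_tokens are per-request averages.
    Hypothetical helper for illustration only.
    """
    input_cost = requests * input_tokens / 1_000_000 * input_price
    output_cost = requests * output_tokens / 1_000_000 * output_price
    return round(input_cost + output_cost, 2)


# 1K requests at ~1K input and ~500 output tokens each:
# 1M input tokens ($0.30) plus 0.5M output tokens ($0.60).
print(estimate_cost(1000, 1000, 500))
```

Changing the token averages shows how quickly output-heavy workloads dominate the bill, since output tokens cost four times as much as input tokens at these rates.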

Prices are subject to change. Check the official documentation for current pricing. See the cost calculator for detailed estimates.

Open Source Model

MiniMax M3 is an open-weight model and will be free to download and self-host once the weights are published. The license has not yet been announced. Hosted API pricing varies by provider (e.g., Together, Fireworks, Groq). See our open source LLM guide for deployment options.