On June 1, 2026, MiniMax released M3 — and it quietly achieved something no open-weight model has done before: combining frontier-level coding, a 1-million token context window, and native multimodality in a single model you can download and run yourself.
It also costs $0.60 per million input tokens. For comparison, Claude Opus 4.8 costs $5.00 per million and GPT-5.5 costs $7.50.
Here's the full picture.
What Is MiniMax M3?
MiniMax is a Chinese AI company that's been building multimodal models since 2021. M3 is their most ambitious release — an open-weight model designed to compete directly with proprietary frontier models.
Key specs:
- Context window: 1,000,000 tokens
- Architecture: Sparse attention (new MSA design)
- Modalities: Text, images, video input
- Weights: Published on Hugging Face and GitHub
- API pricing: $0.60/M input, $1.80/M output
Why the Architecture Matters
Most 1M-token models get very slow at long contexts. MiniMax's sparse attention mechanism cuts computational requirements to as little as 1/20th of previous approaches — achieving 15.6× faster decoding speed at 1M tokens compared to standard attention.
In practice, this means M3 doesn't slow to a crawl when you feed it a full codebase or a 500-page document. The speed stays practical.
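To make the idea concrete, here is a toy sketch of one generic family of sparse attention: block selection. It is not MiniMax's MSA design, whose internals this article does not cover. The function names, block size, and number of selected blocks are all illustrative assumptions. Each block of keys is summarised by its mean, the query is scored against those summaries, and full softmax attention runs only over the keys in the best-scoring blocks.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Dense softmax attention for a single query vector."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def block_sparse_attend(query, keys, values, block_size, top_blocks):
    """Generic block-selection sparse attention (hypothetical, not MSA).

    Returns the attention output and the number of keys actually read.
    """
    # Summarise each block by the mean of its keys. A real system would
    # cache these summaries instead of recomputing them for every query.
    summaries = []
    for start in range(0, len(keys), block_size):
        block = keys[start:start + block_size]
        summaries.append([sum(col) / len(block) for col in zip(*block)])

    # Rank blocks by how well their summary matches the query.
    ranked = sorted(range(len(summaries)),
                    key=lambda i: dot(query, summaries[i]),
                    reverse=True)
    chosen = sorted(ranked[:top_blocks])

    # Run ordinary attention over the keys in the chosen blocks only.
    idx = [i for b in chosen
           for i in range(b * block_size, min((b + 1) * block_size, len(keys)))]
    output = attend(query, [keys[i] for i in idx], [values[i] for i in idx])
    return output, len(idx)
```

Selecting 2 of 5 blocks in a 40-token toy context means the query reads 16 keys instead of 40. When every block is selected, the result is identical to dense attention, which is a handy sanity check.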
Benchmark Performance
| Benchmark | MiniMax M3 | GPT-5.5 | Claude Opus 4.7 | Gemini 3.1 Pro |
|-----------|-----------|---------|----------------|----------------|
| SWE-Bench Pro | 59.0% | 63.1% | 64.3% | 61.8% |
| Terminal-Bench 2.1 | 66.0% | 62.4% | 65.1% | 61.2% |
| BrowseComp | 83.5 | 79.2 | 81.4 | 80.1 |
M3 leads all three proprietary models on Terminal-Bench 2.1 and BrowseComp. On SWE-Bench Pro it trails them, sitting 4.1 points behind GPT-5.5 and 5.3 points behind Claude Opus 4.7. That is still impressive for an open-weight model at $0.60/M input tokens.
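A quick way to read the table is to compute, for each benchmark, who leads and how far M3 sits from its best rival. The sketch below only uses the scores from the table; the dictionary layout and function names are illustrative.

```python
# Scores copied from the benchmark table above.
SCORES = {
    "SWE-Bench Pro": {
        "MiniMax M3": 59.0, "GPT-5.5": 63.1,
        "Claude Opus 4.7": 64.3, "Gemini 3.1 Pro": 61.8,
    },
    "Terminal-Bench 2.1": {
        "MiniMax M3": 66.0, "GPT-5.5": 62.4,
        "Claude Opus 4.7": 65.1, "Gemini 3.1 Pro": 61.2,
    },
    "BrowseComp": {
        "MiniMax M3": 83.5, "GPT-5.5": 79.2,
        "Claude Opus 4.7": 81.4, "Gemini 3.1 Pro": 80.1,
    },
}

def leader(benchmark):
    """Return the model with the highest score on a benchmark."""
    results = SCORES[benchmark]
    return max(results, key=results.get)

def margin_vs_best_rival(benchmark, model="MiniMax M3"):
    """Gap between `model` and the best other model (negative means behind)."""
    results = SCORES[benchmark]
    best_rival = max(score for name, score in results.items() if name != model)
    return round(results[model] - best_rival, 1)

for name in SCORES:
    print(f"{name}: leader = {leader(name)}, "
          f"M3 margin = {margin_vs_best_rival(name):+.1f}")
```

On these numbers, M3 is 5.3 points behind the leader on SWE-Bench Pro, 0.9 ahead of its best rival on Terminal-Bench 2.1, and 2.1 ahead on BrowseComp.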
The Three Firsts
MiniMax claims M3 is the first and only open-weight model to combine all three of:
- Frontier-tier software engineering (59% SWE-Bench Pro)
- 1M token context window with practical speed
- Native multimodality (images + video, not just text)
Each of these exists individually in other open models. None has had all three until M3.
What "Open Weight" Actually Means Here
Worth being clear: M3 is open weight, not fully open source. MiniMax has released the model weights under an open license, but has not released:
- Training code
- Inference operators
- Full training data details
For most use cases (self-hosting, fine-tuning, research), this doesn't matter. But it falls short of a fully open-source release, in which training code and data details would be published alongside the weights.
Pricing Comparison
| Model | Input ($/1M) | Output ($/1M) | Context | Open Weight |
|-------|-------------|--------------|---------|-------------|
| MiniMax M3 | $0.60 | $1.80 | 1M | ✅ |
| GPT-5.5 | $7.50 | $30.00 | 128k | ❌ |
| Claude Opus 4.8 | $5.00 | $25.00 | 200k | ❌ |
| Gemini 3.1 Pro | $3.50 | $10.50 | 2M | ❌ |
| Kimi k2 | $0.55 | $2.20 | 1M | ❌ |
M3 is 12.5× cheaper than GPT-5.5 on input tokens ($0.60 vs. $7.50 per million) and about 16.7× cheaper on output, while beating it on Terminal-Bench 2.1 and BrowseComp.
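The arithmetic behind these comparisons is easy to rerun for your own traffic. The sketch below uses the prices from the table; the function names and the example workload (1 billion input tokens and 200 million output tokens per month) are hypothetical.

```python
# (input $/1M tokens, output $/1M tokens), copied from the pricing table.
PRICES = {
    "MiniMax M3": (0.60, 1.80),
    "GPT-5.5": (7.50, 30.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Gemini 3.1 Pro": (3.50, 10.50),
    "Kimi k2": (0.55, 2.20),
}

def price_ratio(model, baseline="MiniMax M3"):
    """How many times more `model` costs than `baseline` (input, output)."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return model_in / base_in, model_out / base_out

def workload_cost(model, input_tokens, output_tokens):
    """Total cost in dollars for a given token volume."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

def savings_percent(current, candidate, input_tokens, output_tokens):
    """Percentage saved by moving a workload from `current` to `candidate`."""
    before = workload_cost(current, input_tokens, output_tokens)
    after = workload_cost(candidate, input_tokens, output_tokens)
    return 100 * (before - after) / before

# Hypothetical monthly workload: 1B input tokens, 200M output tokens.
monthly_in, monthly_out = 1_000_000_000, 200_000_000
for name in PRICES:
    print(f"{name}: ${workload_cost(name, monthly_in, monthly_out):,.2f}")
print(f"Savings, GPT-5.5 -> M3: "
      f"{savings_percent('GPT-5.5', 'MiniMax M3', monthly_in, monthly_out):.1f}%")
```

For that hypothetical workload, the bill drops from $13,500 on GPT-5.5 to $960 on M3, a saving of about 93%. The exact figure shifts with your input-to-output ratio, since the output price gap is wider than the input gap.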
Best Use Cases for MiniMax M3
1. High-volume API applications: The price gap is enormous. If you're making millions of API calls, switching from GPT-5.5 to M3 for eligible tasks could cut costs by 90%+.
2. Long document analysis: 1M context + fast sparse attention = ideal for legal documents, financial reports, academic papers, and codebases.
3. Self-hosted deployment: Teams with data privacy requirements can run M3 on their own infrastructure. Not possible with GPT or Claude.
4. Research and fine-tuning: Open weights enable fine-tuning on domain-specific data, something proprietary APIs don't allow.
5. Multimodal pipelines: Native image and video understanding without needing a separate vision model.
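For the long-document use case, a rough pre-flight check can tell you which models could take a document in a single request. This is a sketch under stated assumptions: the four-characters-per-token figure is a common rule of thumb for English text, not M3's tokenizer, the context sizes are read from the pricing table with "k" as 1,000 and "M" as 1,000,000, and the output reserve is an arbitrary placeholder.

```python
import math

# Context windows in tokens, read from the pricing table.
CONTEXT_WINDOWS = {
    "MiniMax M3": 1_000_000,
    "GPT-5.5": 128_000,
    "Claude Opus 4.8": 200_000,
    "Gemini 3.1 Pro": 2_000_000,
    "Kimi k2": 1_000_000,
}

# Assumption: roughly 4 characters per token for English prose.
# Real counts depend on the tokenizer and the language of the text.
CHARS_PER_TOKEN = 4

def estimate_tokens(text):
    """Rough token estimate from character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def models_that_fit(token_count, reserved_for_output=8_000):
    """Models whose context window holds the input plus an output reserve."""
    needed = token_count + reserved_for_output
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]

# Example: a 2,000,000-character document (about 500,000 estimated tokens).
document = "x" * 2_000_000
tokens = estimate_tokens(document)
print(tokens, models_that_fit(tokens))
```

Treat the result as a first filter only. Before committing to a pipeline, count tokens with the model's actual tokenizer.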
Limitations
- English reasoning slightly below Claude Opus 4.8
- Training code not fully public (can't reproduce from scratch)
- Smaller ecosystem than OpenAI/Anthropic
- Limited official tutorials and documentation vs. GPT/Claude
Verdict
MiniMax M3 is the most important open-weight AI release of 2026 so far. The combination of frontier coding performance, 1M context, multimodality, and $0.60/M input pricing is unprecedented.
For enterprise teams running high-volume workloads, the cost savings alone justify serious evaluation. For researchers wanting open-weight access to frontier-level capability, it's immediately the best option available.
It doesn't topple Claude Opus 4.8 overall. But at roughly one-eighth the input price, it doesn't need to.
Rating: 9.0/10
Published June 17, 2026.