MiniMax Releases M3: First Open-Weight Model Combining 1-Million-Token Context with Native Multimodal Powers

MiniMax launches M3, the first open-weight model combining a 1-million-token context window, native multimodal capabilities, and frontier coding.

Shanghai-based artificial intelligence firm MiniMax has launched MiniMax M3, a new open-weight model featuring a one-million-token context window, native multimodal capabilities, and frontier-level coding performance. Released on June 1, 2026, the model uses a new sparse attention mechanism designed to cut the high operating costs typically associated with ultra-long-context AI systems.

A vertical timeline infographic showing the key milestones of MiniMax.

The model marks a change of direction for MiniMax, which went public on the Hong Kong Stock Exchange in January 2026. After building its previous M2 series on full attention, the company has returned to sparse attention with M3. The shift targets the core computational bottleneck of context scaling and positions the model as a competitive, cost-effective option for developers.

The MiniMax Sparse Attention Architecture

At the core of M3’s performance is the MiniMax Sparse Attention (MSA) architecture. With conventional full attention, compute cost grows quadratically with context length, which makes long-context processing prohibitively expensive. MSA addresses this by reducing the per-token compute at a one-million-token context length to approximately 1/20th of what previous-generation models required.
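To make the scaling argument concrete, the sketch below counts how many keys a single token attends to under full attention versus a fixed attention budget. It is an illustration only: the article does not describe how MSA selects tokens, so the `budget` parameter and the assumption that the baseline scores every key are hypothetical, chosen so that the ratio matches the reported 1/20th figure.

```python
def total_pairs_full(n: int) -> int:
    """Query-key pairs scored by causal full attention over n tokens.

    Token i attends to i keys (itself included), so the total is
    1 + 2 + ... + n, which grows quadratically with n.
    """
    return n * (n + 1) // 2


def keys_scored_full(n: int) -> int:
    """Keys scored by the newest token under full attention at context n."""
    return n


def keys_scored_sparse(n: int, budget: int) -> int:
    """Keys scored by the newest token when attention is capped at `budget`.

    `budget` is a hypothetical knob for this sketch, not a documented
    MSA parameter.
    """
    return min(n, budget)


context = 1_000_000
budget = context // 20  # assumption: picked to reproduce the 1/20th figure

ratio = keys_scored_sparse(context, budget) / keys_scored_full(context)
print(f"per-token keys, full:   {keys_scored_full(context):,}")
print(f"per-token keys, sparse: {keys_scored_sparse(context, budget):,}")
print(f"sparse / full ratio:    {ratio}")

# Doubling the context roughly quadruples total full-attention work.
growth = total_pairs_full(2 * context) / total_pairs_full(context)
print(f"full-attention growth when context doubles: {growth:.3f}x")
```

Under these assumptions, per-token work stays flat once the context exceeds the budget, while full attention keeps growing with every added token. Real systems also spend compute outside attention, so the measured saving need not match this simple count.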

A technical architectural diagram explaining 'MiniMax Sparse Attention (MSA)'

This efficiency translates into substantial performance gains. According to company metrics, M3 prefills more than 9x faster and decodes 15x faster than its predecessors. MiniMax engineering lead Skyler Miao indicated that MSA is designed specifically to make deploying ultra-long-context AI agents economically viable. By easing the compute bottleneck, MiniMax aims to lower the cost barrier for enterprises deploying complex, persistent AI workflows.

Frontier Performance on Coding and Agentic Tasks

MiniMax has positioned M3 as a direct competitor to some of the industry’s leading proprietary models. On the SWE-Bench Pro benchmark, which measures an AI's ability to solve software engineering problems in real-world codebases, MiniMax M3 scored 59.0%. This puts it ahead of Google’s Gemini 3.1 Pro (54.2%) and narrowly ahead of OpenAI’s GPT-5.5 (58.6%), while closely trailing Anthropic's Claude Opus 4.7.

A bar chart infographic comparing coding and browsing benchmarks

In addition to software development, M3 demonstrates strong capabilities in autonomous web browsing and terminal execution. The model achieved a score of 83.5% on the BrowseComp benchmark for web search agents, outperforming Claude Opus 4.7, which scored 79.3%. Furthermore, M3 scored 66.0% on Terminal-Bench 2.1, a benchmark designed to evaluate how effectively AI agents execute commands and complete tasks within terminal environments.

However, independent industry analysts note that while M3 compares favorably to Claude Opus 4.7, Anthropic has since released Claude Opus 4.8. Early reports suggest that Opus 4.8 outperforms M3 on several comparable agentic evaluations. Additionally, because the benchmark figures for M3 are company-provided, the broader AI research community is waiting for independent verification.

Deep Multimodal Integration

Unlike the MiniMax M1 model released in June 2025, which also offered a one-million-token context window but focused strictly on text-based reasoning and tool use, M3 is natively multimodal. The model was trained on approximately 100 trillion tokens, with text and images interleaved from the start of training.

This joint training allows M3 to process complex visual data alongside large volumes of text. The model supports both image and video inputs and can interact directly with a desktop operating system. This places M3 at the forefront of the emerging class of "OS Agents": AI systems that automate multi-step workflows across computer interfaces by observing screen output and executing precise keyboard and mouse actions.
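The observe-then-act cycle behind such agents can be sketched in a few lines. Everything here is hypothetical and exists only to show the control flow: `FakeDesktop` stands in for a real screen, `scripted_policy` stands in for a model call, and none of these names come from MiniMax's API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str        # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeDesktop:
    """Toy environment: a single text field that must be clicked before typing."""

    def __init__(self) -> None:
        self.focused = False
        self.field = ""

    def screenshot(self) -> str:
        # A real agent would receive pixels; a string keeps the sketch runnable.
        return f"focused={self.focused} field={self.field!r}"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.focused = True
        elif action.kind == "type" and self.focused:
            self.field += action.text


def scripted_policy(observation: str) -> Action:
    """Placeholder for the model: choose the next action from the observation."""
    if "focused=False" in observation:
        return Action("click", x=100, y=200)
    if "field=''" in observation:
        return Action("type", text="hello")
    return Action("done")


def run_agent(
    desktop: FakeDesktop,
    policy: Callable[[str], Action],
    max_steps: int = 10,
) -> List[Action]:
    """Loop: observe the screen, ask the policy for an action, apply it."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = policy(desktop.screenshot())
        history.append(action)
        if action.kind == "done":
            break
        desktop.apply(action)
    return history


desktop = FakeDesktop()
history = run_agent(desktop, scripted_policy)
print([a.kind for a in history], desktop.field)
```

The `max_steps` cap is a design choice worth keeping in any real loop of this kind, since a policy that never emits a stop action would otherwise run indefinitely.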

An illustration of an AI agent operating a computer desktop.

Pricing and the Promise of Open Weights

MiniMax is making M3 available through its proprietary API, MiniMax Code, and Token Plan services. To encourage developer adoption, the company has introduced promotional pricing starting at $0.30 per million input tokens and $1.20 per million output tokens. This aggressive pricing strategy, combined with the computational efficiency of MSA, could significantly disrupt the economics of deploying commercial-grade AI agents.
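For a sense of scale, the quoted promotional rates can be turned into a per-request cost. The sketch below assumes flat per-token billing at exactly those two rates; any caching discounts, pricing tiers, or post-promotion prices are not covered by the article and are not modeled. The function name and the example token counts are illustrative.

```python
from decimal import Decimal

# Promotional rates quoted in the article, in USD per million tokens.
INPUT_USD_PER_MILLION = Decimal("0.30")
OUTPUT_USD_PER_MILLION = Decimal("1.20")
TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of one request under flat per-token billing (an assumption)."""
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Hypothetical request: a full one-million-token prompt and a 4,000-token reply.
cost = request_cost_usd(1_000_000, 4_000)
print(f"estimated cost: ${cost}")
```

`Decimal` is used instead of floats so that small per-token amounts add up without rounding drift. For this example the estimate comes to 0.3048 USD, nearly all of it from the input side, which is why input pricing tends to dominate for long-context workloads.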

Perhaps the most significant aspect of the release is MiniMax's commitment to the open-weight community. The company has announced plans to release the model weights and a comprehensive technical report on Hugging Face and GitHub within 10 days of the launch, setting a target date of June 11, 2026.

If MiniMax delivers on this promise, M3 will become the first open-weight model to offer a one-million-token context window alongside frontier coding and native multimodal capabilities. Open-weight availability could democratize access to advanced agentic workflows, allowing researchers and independent developers to build complex, cost-effective AI applications without being locked into proprietary ecosystems. Until the weights are published, however, the AI community remains cautiously optimistic, waiting to verify whether the model's performance and open-weight status live up to the launch-day claims.