Xiaomi’s MiMo Code Is an Open-Source Claude Code Challenger That Wins at 200-Step Tasks

June 15, 2026
10:08 am

Xiaomi’s MiMo AI team open-sourced MiMo Code V0.1.0 on June 10, 2026 — a terminal coding agent that scores 82% on SWE-bench Verified versus Claude Code’s 79%, and wins more than 65% of head-to-head matchups past 200 execution steps. The edge is not from a better model. It comes from a persistent cross-session memory system that refuses to lose context when tasks get long. It installs with a single command and currently offers free access to a 1M-token multimodal model.

What MiMo Code Is

MiMo Code is a terminal-native AI coding agent released under the MIT license by Xiaomi’s MiMo AI division. It is a fork of the open-source OpenCode agent, extended with Xiaomi’s memory architecture, workflow modes, and model harness. It runs on macOS, Linux, and Windows.

Install on macOS/Linux:

curl -fsSL https://mimo.xiaomi.com/install | bash

Install on Windows via npm:

npm install -g @mimo-ai/cli

Zero configuration required. On first launch, MiMo Code connects to “MiMo Auto” — Xiaomi’s free-for-now channel backed by MiMo-V2.5, a 310 billion parameter sparse mixture-of-experts model with a 1 million token context window. It also auto-imports MCP servers, custom skills, and API configs from an existing Claude Code installation — so switching is frictionless if you have an existing setup. For developers building with MCP today, there is no migration step required.

The Memory Architecture Is the Real Story

Every agentic coding tool degrades over long sessions. Context windows fill up, earlier decisions get compressed away, and developers end up re-explaining projects they already explained an hour ago. MiMo Code’s core argument is that compression is the wrong solution. You need explicit storage and retrieval instead.

The system uses four persistent layers running simultaneously:

Project memory: A persistent MEMORY.md file in your repo that survives across sessions
Session checkpoints: Structured snapshots of task state at key decision points
Scratch notes: Ephemeral working memory used during active task execution
Per-task progress logs: Decision history that can be queried when context gets tight

A dedicated “checkpoint-writer” subagent handles all of this while the primary coding agent keeps building. No interruptions, no pausing. When the primary agent’s context window fills, it queries the checkpoint-writer to reconstruct its environment from structured state and picks up exactly where it left off. This is the mechanism Xiaomi credits for the performance gap at 200+ steps.

A /dream command runs roughly every seven days to review historical sessions, deduplicate them, and compress repeated workflows into long-term memory — a similar approach to what OpenAI uses for ChatGPT memory and what Anthropic has implemented for Claude agents.

Benchmark Numbers

Xiaomi tested MiMo Code paired with MiMo-V2.5-Pro against Claude Code running Claude Sonnet 4.6 across three evaluations:

Benchmark	MiMo Code	Claude Code
SWE-bench Verified	82%	79%
SWE-bench Pro	62%	55%
Terminal Bench 2	73%	69%

The harness itself accounts for a measurable share of those gains. Running the same MiMo-V2.5-Pro model through both harnesses, MiMo Code scores 62% versus Claude Code’s 57% on SWE-bench Pro, and 73% versus 68% on Terminal Bench 2 — roughly five percentage points per benchmark attributable purely to the agent system, not the model.

In a human double-blind A/B evaluation involving 576 developers, 474 private repositories, and 1,213 judged head-to-head task pairs: below 200 execution steps, the two tools split roughly 50/50. Past 200 steps, MiMo Code’s win rate climbs above 65%. That threshold is exactly where Xiaomi’s memory system starts paying off.

Important caveats: these are vendor self-reported numbers. MiMo Code does not yet appear on the official Terminal-Bench 2.0 or SWE-bench leaderboards. For context, OpenAI’s Codex CLI running GPT-5.5 scores 82.2% on Terminal-Bench 2.0 officially — about nine points above MiMo Code’s claimed 73%. The competitive picture is more complex than Xiaomi’s own benchmarks suggest.

Pricing: Aggressively Below Claude Code

MiMo-V2.5 access is currently free through MiMo Auto. When paid pricing kicks in, the rates are among the lowest available for frontier-class agentic models:

Model	Input $/1M tokens	Output $/1M tokens	Notes
MiMo-V2.5	$0.40	$2.00	Default MiMo Code model (currently free)
MiMo-V2.5-Pro (≤256K)	$1.00	$3.00	Higher-end tasks; 1T param MoE model
Claude Opus 4.8	$5.00	$25.00	Default Claude Code model
GPT-5.5	$5.00	$30.00	OpenAI Codex CLI
Claude Fable 5 / Mythos	$10.00	$50.00	Top-tier Anthropic models

For developers who don’t want Xiaomi’s models, MiMo Code also supports any OpenAI-compatible API endpoint, plus DeepSeek, Kimi, and Zhipu’s GLM backends. It functions as a bring-your-own-model agent harness pointing at whatever inference provider you approve.

Three Features Worth Knowing

Compose mode: Press Tab to enter a specification-driven workflow. You describe a high-level goal; the system autonomously runs the full development cycle — design, planning, coding, testing, and review — end to end.
Voice control: Built on Xiaomi’s MiMo-ASR speech recognition with TenVAD voice activity detection. You can dictate instructions verbally and say “send” or “execute” for fully hands-free operation. Requires a logged-in account.
Claude Code migration: MiMo Code auto-imports your existing MCP servers, custom skills, and API configurations from Claude Code. If you’re building with the Claude Code CLI today, there is no manual migration step.

What to Watch Before You Switch

Three real concerns alongside the strong benchmark story:

V0.1.0 maturity: The version number signals exactly what it suggests. Expect rough edges, missing features, and breaking changes.
Free model routes through Xiaomi’s servers: Your code context travels to Xiaomi’s inference infrastructure during the free tier. If you work with proprietary codebases or operate under data residency requirements, check the terms before connecting.
Self-reported benchmarks: Neither MiMo Code nor the underlying MiMo-V2.5-Pro appear on official SWE-bench or Terminal-Bench leaderboards yet. The head-to-head configuration choices matter significantly for these kinds of comparisons.

The Bottom Line

MiMo Code is the first non-Anthropic terminal coding agent with benchmark numbers that genuinely challenge Claude Code rather than just claiming to. The persistent memory architecture addresses a problem that every developer using AI coding tools on long sessions knows well — context drift on complex, multi-hour tasks. The free model access drops the evaluation cost to zero.

The caveats are real and worth taking seriously: V0.1.0 maturity, self-reported benchmarks, and data routing through Xiaomi’s servers. For developers managing proprietary code, the bring-your-own-model option (pointing MiMo Code at your own API endpoint) is the safer path. For everyone else, there is no good reason not to spend an afternoon testing it against your current Claude Code setup. If you want to compare multiple AI coding agents side by side without switching terminals, Bind AI’s IDE lets you run the same task across models and see the difference directly.

The AI workspace that turns prompts into results.

Plan, research, and ship faster with AI that understands your work.

From PRD to production before the week is over. Build with Friday AI

Available on:

tryfriday.ai

product_team_goals:

time_to_market: "shipped_in_hours"

dev_alignment: "prds_to_clean_code"

overhead: "zero_waste_meetings"

sprint_status: features_deployed_successfully...