# MiMo-V2-Omni: AI model fact sheet

- **Provider:** xiaomi
- **Released:** 2026-03-18
- **Context window:** 262,144 tokens
- **API pricing:** $0.40 / 1M input, $2.00 / 1M output
- **OpenRouter ID:** xiaomi/mimo-v2-omni
- **Capabilities:** conversation, reasoning, code-generation, analysis, agentic-tool-use

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability including visual grounding, multi-step planning, tool use, and code execution, making it well-suited for complex real-world tasks that span modalities.

Source: real side-by-side outputs, pricing and specs at https://rival.tips/models/mimo-v2-omni