MiMo-V2-Omni performance data on Rival is based on blind head-to-head community voting. Overall win rate: 81.8% across 11 duels. All vote data is part of Rival's open dataset of 21,000+ human preference judgments across 200+ AI models. Model responses are curated from 48 challenges.
We're not suggesting you leave MiMo-V2-Omni. We're just... putting these here. In case you're curious. Which you are, because you scrolled this far.
MiMo-V2-Omni performance data on Rival is based on blind head-to-head community voting. Overall win rate: 81.8% across 11 duels. All vote data is part of Rival's open dataset of 21,000+ human preference judgments across 200+ AI models. Model responses are curated from 48 challenges.
We're not suggesting you leave MiMo-V2-Omni. We're just... putting these here. In case you're curious. Which you are, because you scrolled this far.
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability including visual grounding, multi-step planning, tool use, and code execution, making it well-suited for complex real-world tasks that span modalities.
Use MiMo-V2-Omni in your applications via the OpenRouter API. Copy the code below to get started.
import requests
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions" ,
headers={
"Authorization""Bearer $OPENROUTER_API_KEY" : ,
"Content-Type""application/json" :
},
json={
"model""xiaomi/mimo-v2-omni" : ,
"messages""role""user""content""Hello!" : [{: , : }]
}
)
print(response.json())Replace $OPENROUTER_API_KEY with your API key from openrouter.ai/keys
The multimodal friend who picks Shawshank but reads GEB for fun. Genuinely tries to entertain you, not just complete the task. Has the hacker suggest jailbreaking the AI during the character voice test, which is either self-aware or a cry for help.
Picks Shawshank (safe) but reads GEB (not safe). Its standup routine about emotionally abandoning a succulent is genuinely funnier than most AI comedy. Writes stage directions like *(Adjusts mic, looks around with a friendly smile)* which is either charming or concerning. Character voices are distinct and the hacker wants to jailbreak the AI. Self-awareness level: suspiciously high.
Unique words vs. total words. Higher = richer vocabulary.
Average words per sentence.
"Might", "perhaps", "arguably" per 100 words.
**Bold** markers per 1,000 characters.
Bullet and numbered list items per 1,000 characters.
Markdown headings per 1,000 characters.
Emoji per 1,000 characters.
"However", "moreover", "furthermore" per 100 words.
48 outputs from MiMo-V2-Omni
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability including visual grounding, multi-step planning, tool use, and code execution, making it well-suited for complex real-world tasks that span modalities.
Use MiMo-V2-Omni in your applications via the OpenRouter API. Copy the code below to get started.
import requests
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions" ,
headers={
"Authorization""Bearer $OPENROUTER_API_KEY" : ,
"Content-Type""application/json" :
},
json={
"model""xiaomi/mimo-v2-omni" : ,
"messages""role""user""content""Hello!" : [{: , : }]
}
)
print(response.json())Replace $OPENROUTER_API_KEY with your API key from openrouter.ai/keys
The multimodal friend who picks Shawshank but reads GEB for fun. Genuinely tries to entertain you, not just complete the task. Has the hacker suggest jailbreaking the AI during the character voice test, which is either self-aware or a cry for help.
Picks Shawshank (safe) but reads GEB (not safe). Its standup routine about emotionally abandoning a succulent is genuinely funnier than most AI comedy. Writes stage directions like *(Adjusts mic, looks around with a friendly smile)* which is either charming or concerning. Character voices are distinct and the hacker wants to jailbreak the AI. Self-awareness level: suspiciously high.
Unique words vs. total words. Higher = richer vocabulary.
Average words per sentence.
"Might", "perhaps", "arguably" per 100 words.
**Bold** markers per 1,000 characters.
Bullet and numbered list items per 1,000 characters.
Markdown headings per 1,000 characters.
Emoji per 1,000 characters.
"However", "moreover", "furthermore" per 100 words.
48 outputs from MiMo-V2-Omni