# Inception: Mercury 2: AI model fact sheet

- **Provider:** inception
- **Released:** 2026-03-04
- **Context window:** 128,000 tokens
- **API pricing:** $0.25 / 1M input, $0.75 / 1M output
- **OpenRouter ID:** inception/mercury-2
- **Capabilities:** conversation, reasoning, code-generation, analysis, tool-use

Mercury 2 is an extremely fast reasoning LLM and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving over 1000 tokens per second on standard GPUs. Mercury 2 is 5x+ faster than leading speed-optimized LLMs like Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output.

Source: real side-by-side outputs, pricing and specs at https://rival.tips/models/mercury-2