Moonshot AI Models: All 7 Compared

Kimi K2.7 Code

Jun 2026

Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long contexts. It uses a native multimodal mixture-of-experts architecture that accepts text, image and video input, and it always operates in a thinking mode, preserving full reasoning content across multi-turn conversations. With a 256K-token context window, it targets long-horizon coding, agentic task decomposition, and multi-turn dialogue. The model activates 32B parameters out of roughly 1T total.

conversation, reasoning, code-generation, analysis, agentic-tool-use, tool-use

Kimi K2.6

Apr 2026

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and can convert prompts and visual inputs into production-ready interfaces. Its agent swarm architecture scales to hundreds of parallel sub-agents for autonomous task decomposition, delivering documents, websites, and spreadsheets in a single run without human oversight.

conversation, reasoning, code-generation, analysis, agentic-tool-use, web-design

Kimi K2.5

Jan 2026

Kimi K2.5 is Moonshot AI's native multimodal model, offering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed visual and text tokens, it performs strongly in general reasoning, visual coding, and agentic tool-calling.

conversation, reasoning, code-generation, analysis

Kimi Linear 48B A3B Instruct

Nov 2025

Kimi Linear is a hybrid linear attention architecture that outperforms traditional full attention methods. It features Kimi Delta Attention (KDA) for efficient memory usage, reducing KV cache size by up to 75% and boosting throughput by up to 6x for contexts as long as 1M tokens.

conversation, reasoning, code-generation, analysis
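
To give a sense of scale for the "up to 75%" KV cache figure, here is a rough back-of-the-envelope sketch. It uses the standard full-attention cache estimate (keys plus values, per layer, per KV head). The layer count, head count, and head dimension are hypothetical placeholders, not Kimi Linear's published configuration, and the 75% value is treated as a best-case upper bound.

```python
def kv_cache_bytes(seq_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Estimate the KV cache size of a full-attention model for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


def reduced_bytes(full_bytes: float, reduction: float) -> float:
    """Cache size remaining after a fractional reduction (0.75 means 75% smaller)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return full_bytes * (1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    full = kv_cache_bytes(seq_len=1_000_000, n_layers=32,
                          n_kv_heads=8, head_dim=128)
    best_case = reduced_bytes(full, 0.75)  # "up to 75%" taken at its maximum
    print(f"full attention:   {full / 2**30:.1f} GiB")
    print(f"with 75% savings: {best_case / 2**30:.1f} GiB")
```

The actual savings depend on the model's real layer layout and on how many layers keep full attention, so the output is only an order-of-magnitude guide.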

Kimi K2 Thinking

Nov 2025

Kimi K2 Thinking is Moonshot AI's most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports a 256K-token context window. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. By interleaving reasoning with tool use, it can carry out autonomous research, coding, and writing across hundreds of sequential actions without drift.

conversation, reasoning, code-generation, analysis, tool-use

MoonshotAI: Kimi K2 0905

Sep 2025

Kimi K2 0905 is the September update of Kimi K2 0711. It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It supports long-context inference up to 256K tokens, extended from the previous 128K. This update improves agentic coding with higher accuracy and better generalization across scaffolds, and enhances frontend coding with more aesthetic and functional outputs for web, 3D, and related tasks. Kimi K2 is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. It excels across coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) benchmarks. The model is trained with a novel stack incorporating the MuonClip optimizer for stable large-scale MoE training.

conversation, reasoning, code-generation, analysis, tool-use

Kimi K2

Jul 2025

Kimi K2 is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. Kimi K2 excels across a broad range of benchmarks, particularly in coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) tasks. It supports long-context inference up to 128K tokens and is designed with a novel training stack that includes the MuonClip optimizer for stable large-scale MoE training.

conversation, reasoning, code-generation, analysis
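
The "1 trillion total, 32 billion active" figures quoted for the K2 family can be turned into an active-parameter share with simple arithmetic. The sketch below is illustrative only: the function name is made up for this example, and the inputs are the rounded figures from the descriptions above rather than exact parameter counts.

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Share of a sparse MoE model's parameters used per forward pass."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


if __name__ == "__main__":
    TOTAL_PARAMS = 1e12    # ~1 trillion total (rounded figure from the listing)
    ACTIVE_PARAMS = 32e9   # 32 billion active per forward pass
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per forward pass: {share:.1%}")  # prints 3.2%
```

In other words, each token is processed by roughly one thirtieth of the model's weights, which is why per-token compute is closer to that of a 32B dense model even though all of the weights must still be stored.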

Model Evolution

Moonshot AI

Kimi K2.7 Code

Kimi K2.6

Kimi K2.5

Kimi Linear 48B A3B Instruct

Kimi K2 Thinking

MoonshotAI: Kimi K2 0905

Kimi K2
