Qwen3 0.6B

Qwen3 0.6B

A 0.6B parameter dense model from the Qwen3 family. Supports seamless switching between 'thinking' mode (complex tasks) and 'non-thinking' mode (general conversation). Trained on 36 trillion tokens across 119 languages. Features enhanced reasoning, instruction-following, agent capabilities, and multilingual support.

ConversationCode Generation
Provider
Qwen
Release Date
2025-04-29
Size
SMALL
Parameters
0.6B

Model Insights

All Model Responses

Sponsored
Ad

Sponsored Content

Advertisement

Native Advertisement

Related Models

Qwen: Qwen3 235B A22B 2507 logo

Qwen: Qwen3 235B A22B 2507

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following, logical reasoning, math, code, and tool usage. The model supports a native 262K context length and does not implement "thinking mode" (<think> blocks). Compared to its base variant, this version delivers significant gains in knowledge coverage, long-context reasoning, coding benchmarks, and alignment with open-ended tasks. It is particularly strong on multilingual understanding, math reasoning (e.g., AIME, HMMT), and alignment evaluations like Arena-Hard and WritingBench.

ConversationReasoningCode Generation+1 more