Kimi Linear 48B A3B Instruct performance data on Rival is based on blind head-to-head community voting. Overall win rate: 23.1% across 26 duels. All vote data is part of Rival's open dataset of 21,000+ human preference judgments across 200+ AI models. Model responses are curated from 35 challenges.
We're not suggesting you leave Kimi Linear 48B A3B Instruct. We're just... putting these here. In case you're curious. Which you are, because you scrolled this far.
Kimi Linear 48B A3B Instruct performance data on Rival is based on blind head-to-head community voting. Overall win rate: 23.1% across 26 duels. All vote data is part of Rival's open dataset of 21,000+ human preference judgments across 200+ AI models. Model responses are curated from 35 challenges.
We're not suggesting you leave Kimi Linear 48B A3B Instruct. We're just... putting these here. In case you're curious. Which you are, because you scrolled this far.
Kimi Linear is a hybrid linear attention architecture that outperforms traditional full attention methods. Features Kimi Delta Attention (KDA) for efficient memory usage, reducing KV caches by up to 75% and boosting throughput by up to 6x for contexts as long as 1M tokens.
Use Kimi Linear 48B A3B Instruct in your applications via the OpenRouter API. Copy the code below to get started.
import requests
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions" ,
headers={
"Authorization""Bearer $OPENROUTER_API_KEY" : ,
"Content-Type""application/json" :
},
json={
"model""moonshotai/kimi-linear-48b-a3b-instruct" : ,
"messages""role""user""content""Hello!" : [{: , : }]
}
)
print(response.json())Replace $OPENROUTER_API_KEY with your API key from openrouter.ai/keys
Unique words vs. total words. Higher = richer vocabulary.
Average words per sentence.
"Might", "perhaps", "arguably" per 100 words.
**Bold** markers per 1,000 characters.
Bullet and numbered list items per 1,000 characters.
Markdown headings per 1,000 characters.
Emoji per 1,000 characters.
"However", "moreover", "furthermore" per 100 words.
35 outputs from Kimi Linear 48B A3B Instruct
Kimi Linear is a hybrid linear attention architecture that outperforms traditional full attention methods. Features Kimi Delta Attention (KDA) for efficient memory usage, reducing KV caches by up to 75% and boosting throughput by up to 6x for contexts as long as 1M tokens.
Use Kimi Linear 48B A3B Instruct in your applications via the OpenRouter API. Copy the code below to get started.
import requests
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions" ,
headers={
"Authorization""Bearer $OPENROUTER_API_KEY" : ,
"Content-Type""application/json" :
},
json={
"model""moonshotai/kimi-linear-48b-a3b-instruct" : ,
"messages""role""user""content""Hello!" : [{: , : }]
}
)
print(response.json())Replace $OPENROUTER_API_KEY with your API key from openrouter.ai/keys
Unique words vs. total words. Higher = richer vocabulary.
Average words per sentence.
"Might", "perhaps", "arguably" per 100 words.
**Bold** markers per 1,000 characters.
Bullet and numbered list items per 1,000 characters.
Markdown headings per 1,000 characters.
Emoji per 1,000 characters.
"However", "moreover", "furthermore" per 100 words.
35 outputs from Kimi Linear 48B A3B Instruct