DeepSeek Prover V2 performance data on Rival is based on blind head-to-head community voting. Overall win rate: 33.3% across 21 duels. All vote data is part of Rival's open dataset of 21,000+ human preference judgments across 200+ AI models. Model responses are curated from 6 challenges.
DeepSeek Prover V2 is good. We've said that. We stand by it. But we'd be doing you a disservice if we didn't show you these.
DeepSeek Prover V2 performance data on Rival is based on blind head-to-head community voting. Overall win rate: 33.3% across 21 duels. All vote data is part of Rival's open dataset of 21,000+ human preference judgments across 200+ AI models. Model responses are curated from 6 challenges.
DeepSeek Prover V2 is good. We've said that. We stand by it. But we'd be doing you a disservice if we didn't show you these.
A 671B parameter model, speculated to be geared towards logic and mathematics. Likely an upgrade from DeepSeek-Prover-V1.5. Released on Hugging Face without an announcement or description.
Use DeepSeek Prover V2 in your applications via the OpenRouter API. Copy the code below to get started.
import requests
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions" ,
headers={
"Authorization""Bearer $OPENROUTER_API_KEY" : ,
"Content-Type""application/json" :
},
json={
"model""deepseek/deepseek-prover-v2:free" : ,
"messages""role""user""content""Hello!" : [{: , : }]
}
)
print(response.json())Replace $OPENROUTER_API_KEY with your API key from openrouter.ai/keys
The math tutor who wandered into an open mic night. Structures comedy with section headers like it is submitting a proof. Would label its punchlines if the format allowed it.
A math-focused model forced into creative territory. Its standup routine uses bold section headers like "Dating Profile," "First Dates," "Closing" as if comedy requires a table of contents. The jokes are generic dating app observations. Follows the spec to the letter and adds nothing beyond it.
Taste is judged on an uncapped scale where 100 is the reference, originality first. The space past 100 is the craft today's models rarely reach.
Unique words vs. total words. Higher = richer vocabulary.
Average words per sentence.
"Might", "perhaps", "arguably" per 100 words.
**Bold** markers per 1,000 characters.
Bullet and numbered list items per 1,000 characters.
Markdown headings per 1,000 characters.
Emoji per 1,000 characters.
"However", "moreover", "furthermore" per 100 words.
6 outputs from DeepSeek Prover V2
A 671B parameter model, speculated to be geared towards logic and mathematics. Likely an upgrade from DeepSeek-Prover-V1.5. Released on Hugging Face without an announcement or description.
Use DeepSeek Prover V2 in your applications via the OpenRouter API. Copy the code below to get started.
import requests
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions" ,
headers={
"Authorization""Bearer $OPENROUTER_API_KEY" : ,
"Content-Type""application/json" :
},
json={
"model""deepseek/deepseek-prover-v2:free" : ,
"messages""role""user""content""Hello!" : [{: , : }]
}
)
print(response.json())Replace $OPENROUTER_API_KEY with your API key from openrouter.ai/keys
The math tutor who wandered into an open mic night. Structures comedy with section headers like it is submitting a proof. Would label its punchlines if the format allowed it.
A math-focused model forced into creative territory. Its standup routine uses bold section headers like "Dating Profile," "First Dates," "Closing" as if comedy requires a table of contents. The jokes are generic dating app observations. Follows the spec to the letter and adds nothing beyond it.
Taste is judged on an uncapped scale where 100 is the reference, originality first. The space past 100 is the craft today's models rarely reach.
Unique words vs. total words. Higher = richer vocabulary.
Average words per sentence.
"Might", "perhaps", "arguably" per 100 words.
**Bold** markers per 1,000 characters.
Bullet and numbered list items per 1,000 characters.
Markdown headings per 1,000 characters.
Emoji per 1,000 characters.
"However", "moreover", "furthermore" per 100 words.
6 outputs from DeepSeek Prover V2