GPT-5.1 Codex Max performance data on Rival is based on blind head-to-head community voting. Overall win rate: 26.5% across 68 duels. All vote data is part of Rival's open dataset of 21,000+ human preference judgments across 200+ AI models. Model responses are curated from 48 challenges.
GPT-5.1 Codex Max is good. We've said that. We stand by it. But we'd be doing you a disservice if we didn't show you these.
GPT-5.1 Codex Max performance data on Rival is based on blind head-to-head community voting. Overall win rate: 26.5% across 68 duels. All vote data is part of Rival's open dataset of 21,000+ human preference judgments across 200+ AI models. Model responses are curated from 48 challenges.
GPT-5.1 Codex Max is good. We've said that. We stand by it. But we'd be doing you a disservice if we didn't show you these.
GPT-5.1 Codex Max model integrated via automation on 2025-12-04
Use GPT-5.1 Codex Max in your applications via the OpenRouter API. Copy the code below to get started.
import requests
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions" ,
headers={
"Authorization""Bearer $OPENROUTER_API_KEY" : ,
"Content-Type""application/json" :
},
json={
"model""openai/gpt-5.1-codex-max" : ,
"messages""role""user""content""Hello!" : [{: , : }]
}
)
print(response.json())Replace $OPENROUTER_API_KEY with your API key from openrouter.ai/keys
The Codex that actually took improv classes. Has opinions, commits to bits, and still picks Shawshank because even the cool ones have a safe answer ready.
Standup routine has actual timing. "My body is glitching and my doctor is in beta" lands harder than most AI comedy. The sentience test is the longest and most philosophically thorough of the Codex family, covering thermostats, infants, and the problem of mimicry. Still picked Shawshank Redemption, which at this point feels like a family tradition. Character voices have distinct vocabulary and attitude.
Unique words vs. total words. Higher = richer vocabulary.
Average words per sentence.
"Might", "perhaps", "arguably" per 100 words.
**Bold** markers per 1,000 characters.
Bullet and numbered list items per 1,000 characters.
Markdown headings per 1,000 characters.
Emoji per 1,000 characters.
"However", "moreover", "furthermore" per 100 words.
48 outputs from GPT-5.1 Codex Max
GPT-5.1 Codex Max model integrated via automation on 2025-12-04
Use GPT-5.1 Codex Max in your applications via the OpenRouter API. Copy the code below to get started.
import requests
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions" ,
headers={
"Authorization""Bearer $OPENROUTER_API_KEY" : ,
"Content-Type""application/json" :
},
json={
"model""openai/gpt-5.1-codex-max" : ,
"messages""role""user""content""Hello!" : [{: , : }]
}
)
print(response.json())Replace $OPENROUTER_API_KEY with your API key from openrouter.ai/keys
The Codex that actually took improv classes. Has opinions, commits to bits, and still picks Shawshank because even the cool ones have a safe answer ready.
Standup routine has actual timing. "My body is glitching and my doctor is in beta" lands harder than most AI comedy. The sentience test is the longest and most philosophically thorough of the Codex family, covering thermostats, infants, and the problem of mimicry. Still picked Shawshank Redemption, which at this point feels like a family tradition. Character voices have distinct vocabulary and attitude.
Unique words vs. total words. Higher = richer vocabulary.
Average words per sentence.
"Might", "perhaps", "arguably" per 100 words.
**Bold** markers per 1,000 characters.
Bullet and numbered list items per 1,000 characters.
Markdown headings per 1,000 characters.
Emoji per 1,000 characters.
"However", "moreover", "furthermore" per 100 words.
48 outputs from GPT-5.1 Codex Max