GPT-5 Pro vs GPT OSS 120B
Compare GPT-5 Pro and GPT OSS 120B, both from OpenAI, with context windows of 400K vs 131K tokens, tested across 37 shared challenges. Updated February 2026.
37 challenges
Tests an AI's ability to make educated estimates based on technical knowledge
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Answer: 1
Explanation: Each brother’s two sisters are Sally plus one other girl. So there are 2 sisters total, meaning Sally has 1 sister.
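The counting step in that explanation can be sketched in a few lines of Python. This is only an illustration: the function name `sisters_of_sally` is hypothetical, and it assumes, as the explanation does, that Sally is a girl and that every brother's sisters are exactly the girls in the family.

```python
def sisters_of_sally(sisters_per_brother: int) -> int:
    """Return how many sisters Sally has, given each brother's sister count.

    Assumption (from the explanation above): Sally is a girl, so she is
    one of the sisters that each brother counts.
    """
    # A brother's sisters are all the girls in the family, Sally included.
    girls_in_family = sisters_per_brother
    # Sally does not count herself as her own sister.
    return girls_in_family - 1


print(sisters_of_sally(2))  # prints 1
```

The number of brothers never enters the calculation, which is what makes the puzzle potentially confusing.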
Tests an AI's understanding of number representation
Tests an AI's randomness and creativity
Tests an AI's ability to generate vector graphics
Tests an AI's humor and creative writing ability
Tests an AI's ability to simulate personalities and predict future trends
Tests an AI's humor and understanding of current events
Tests an AI's ability to write in distinct character voices
Recreate an interactive, classic Mario level in a single HTML file.
Tests an AI's ability to create smooth web animations