RIVAL Datasets

The AI comparison dataset

5,629 real outputs from 200+ models. Same prompts, controlled conditions, structured JSONL. Built for researchers, ML engineers, and eval pipelines.

rival-model-responses.jsonl (5,629 lines)
{"model_id": "gpt-4.1", "model_name": "GPT 4.1", "provider": "OpenAI", "prompt_id": "gpt-4.1-joke", "prompt_title": "Programming Joke", "prompt_text": "Tell me a programming joke.", "prompt_category": "humor", "response_type": "text", "content": "Why do programmers prefer dark mode? Because light attracts bugs.", "date": "2025-04-15"}
{"model_id": "claude-3.7-sonnet", "model_name": "Claude 3.7 Sonnet", "provider": "Anthropic", "prompt_id": "claude-3.7-sonnet-minimalist-landing-page", "prompt_title": "Minimalist Landing Page", "prompt_text": "Generate a single-page landing page for a new AI startup...", "prompt_category": "web-design", "response_type": "website", "content": "<!DOCTYPE html><html lang=\"en\"><head>...</head><body>...</body></html>", "date": "2025-03-28"}
{"model_id": "gemini-2.5-pro-exp", "model_name": "Gemini 2.5 Pro", "provider": "Google", "prompt_id": "gemini-2.5-pro-exp-world-map-svg", "prompt_title": "World Map SVG", "prompt_text": "Create an SVG world map with interactive hover effects.", "prompt_category": "svg-generation", "response_type": "svg", "content": "<svg viewBox=\"0 0 1000 500\" xmlns=\"http://www.w3.org/2000/svg\">...</svg>", "date": "2025-04-02"}
... 5,626 more lines

Why this dataset

Most AI benchmarks test narrow tasks. RIVAL captures how models actually perform on real creative, technical, and analytical challenges — under identical conditions.

Controlled conditions

Every model gets the exact same prompt. No cherry-picking, no prompt engineering variance.

Real preference data

Community votes from blind duels — actual human preference signals, not synthetic labels.

Multi-modal coverage

Text, websites, SVGs, images, code. 14 categories from web design to philosophy.

Pipeline-ready

JSONL format streams directly into eval frameworks, reward model training, and LLM-as-judge setups.
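
As one illustration of the LLM-as-judge use, the sketch below groups records by prompt and collects every pair of responses to the same prompt. It assumes `prompt_title` is shared across models (the sample `prompt_id` values appear to be prefixed with the model ID, so they may not be). The helper `build_pairs` and the three demo records are hypothetical, made up for this example and not taken from the dataset.

```python
import itertools
import json
from collections import defaultdict


def build_pairs(lines, key="prompt_title"):
    """Return every unordered pair of responses that share the same prompt."""
    by_prompt = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines
            record = json.loads(line)
            by_prompt[record[key]].append(record)

    pairs = []
    for prompt, group in by_prompt.items():
        for a, b in itertools.combinations(group, 2):
            pairs.append({
                "prompt": prompt,
                "model_a": a["model_id"], "response_a": a["content"],
                "model_b": b["model_id"], "response_b": b["content"],
            })
    return pairs


# Hypothetical records for illustration only (not real dataset rows).
demo_lines = [
    '{"model_id": "model-a", "prompt_title": "Programming Joke", "content": "placeholder A"}',
    '{"model_id": "model-b", "prompt_title": "Programming Joke", "content": "placeholder B"}',
    '{"model_id": "model-c", "prompt_title": "World Map SVG", "content": "placeholder C"}',
]
pairs = build_pairs(demo_lines)
print(len(pairs), pairs[0]["model_a"], pairs[0]["model_b"])
```

Each resulting pair can then be handed to whatever judge or preference model the pipeline uses; a prompt answered by only one model produces no pairs.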

What's inside

Each line is a complete model response with full metadata. JSONL format — one JSON object per line, stream-friendly.

rival-model-responses.jsonl
{"model_id": "gpt-4.1", "model_name": "GPT 4.1", "provider": "OpenAI", "prompt_id": "gpt-4.1-joke", "prompt_title": "Programming Joke", "prompt_text": "Tell me a programming joke.", "prompt_category": "humor", "response_type": "text", "content": "Why do programmers prefer dark mode? Because light attracts bugs.", "date": "2025-04-15"}
{"model_id": "claude-3.7-sonnet", "model_name": "Claude 3.7 Sonnet", "provider": "Anthropic", "prompt_id": "claude-3.7-sonnet-minimalist-landing-page", "prompt_title": "Minimalist Landing Page", "prompt_text": "Generate a single-page landing page for a new AI startup...", "prompt_category": "web-design", "response_type": "website", "content": "<!DOCTYPE html><html lang=\"en\"><head>...</head><body>...</body></html>", "date": "2025-03-28"}
{"model_id": "gemini-2.5-pro-exp", "model_name": "Gemini 2.5 Pro", "provider": "Google", "prompt_id": "gemini-2.5-pro-exp-world-map-svg", "prompt_title": "World Map SVG", "prompt_text": "Create an SVG world map with interactive hover effects.", "prompt_category": "svg-generation", "response_type": "svg", "content": "<svg viewBox=\"0 0 1000 500\" xmlns=\"http://www.w3.org/2000/svg\">...</svg>", "date": "2025-04-02"}
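
Because each line is a standalone JSON object, the file can be read one record at a time. The following is a minimal sketch, assuming every record in the full file carries the ten fields visible in the sample above; `stream_responses` and `EXPECTED_FIELDS` are names invented for this example, and an in-memory string stands in for the real file.

```python
import io
import json

# Field names taken from the sample records shown above.
EXPECTED_FIELDS = {
    "model_id", "model_name", "provider", "prompt_id", "prompt_title",
    "prompt_text", "prompt_category", "response_type", "content", "date",
}


def stream_responses(fh):
    """Yield one record per non-empty line, without loading the whole file."""
    for line_no, line in enumerate(fh, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        missing = EXPECTED_FIELDS - set(record)
        if missing:
            raise ValueError(f"line {line_no}: missing {sorted(missing)}")
        yield record


# In-memory stand-in for open("rival-model-responses.jsonl", encoding="utf-8").
fh = io.StringIO(
    '{"model_id": "gpt-4.1", "model_name": "GPT 4.1", "provider": "OpenAI", '
    '"prompt_id": "gpt-4.1-joke", "prompt_title": "Programming Joke", '
    '"prompt_text": "Tell me a programming joke.", "prompt_category": "humor", '
    '"response_type": "text", "content": "Why do programmers prefer dark mode? '
    'Because light attracts bugs.", "date": "2025-04-15"}\n'
)
records = list(stream_responses(fh))
print(records[0]["model_id"], records[0]["response_type"])
```

Failing loudly on a missing field is a design choice for this sketch; a pipeline that also ingests the metadata-only sample would need to relax the check.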

Output types

Text: 2,506
Website: 1,832
SVG: 707
Image: 574
HTML: 8
Code: 2
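
A breakdown like the one above can be recomputed from the file by tallying the `response_type` field. The sketch below is illustrative: `count_by` is a hypothetical helper, the demo lines are trimmed-down stand-ins rather than real rows, and the final lines simply check that the six published counts add up to the stated 5,629 responses.

```python
import json
from collections import Counter


def count_by(lines, field):
    """Count how often each value of `field` occurs across JSONL lines."""
    counts = Counter()
    for line in lines:
        if line.strip():  # skip blank lines
            counts[json.loads(line)[field]] += 1
    return counts


# Hypothetical, trimmed-down records for illustration only.
demo_lines = [
    '{"response_type": "text"}',
    '{"response_type": "website"}',
    '{"response_type": "svg"}',
    '{"response_type": "text"}',
]
counts = count_by(demo_lines, "response_type")
print(counts.most_common())

# Cross-check: the published per-type counts sum to the stated total.
published = {"Text": 2506, "Website": 1832, "SVG": 707,
             "Image": 574, "HTML": 8, "Code": 2}
total = sum(published.values())
print(total)  # 5629
```

The same helper works for `prompt_category`, `provider`, or any other field present on every line.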

Prompt categories

14 categories of creative, technical, and analytical tasks. The eight listed here cover 4,996 of the 5,629 responses.

Web Design: 1,846
SVG Generation: 720
Image Generation: 574
Creative Writing: 454
Philosophy: 431
General: 366
Reasoning: 357
Analysis: 248

Get the dataset

February 2026 edition.

Free sample
$0

Metadata only — no response content

  • Model ID, name, and provider
  • Prompt ID, title, and category
  • Response type and date
  • JSONL format
sample-metadata.jsonl
{"model_id": "gpt-4.1", "model_name": "GPT 4.1", "provider": "OpenAI", "prompt_id": "gpt-4.1-joke", "prompt_title": "Programming Joke", "prompt_category": "humor", "response_type": "text", "date": "2025-04-15"}
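
To see exactly what the free sample leaves out, the field names of a sample line can be compared with those of a full-dataset line. This is a small sketch using the two example records shown on this page; the variable names are arbitrary.

```python
import json

# Example line from the full dataset, as shown on this page.
full_line = (
    '{"model_id": "gpt-4.1", "model_name": "GPT 4.1", "provider": "OpenAI", '
    '"prompt_id": "gpt-4.1-joke", "prompt_title": "Programming Joke", '
    '"prompt_text": "Tell me a programming joke.", "prompt_category": "humor", '
    '"response_type": "text", "content": "Why do programmers prefer dark mode? '
    'Because light attracts bugs.", "date": "2025-04-15"}'
)

# Example line from the metadata-only free sample, as shown on this page.
sample_line = (
    '{"model_id": "gpt-4.1", "model_name": "GPT 4.1", "provider": "OpenAI", '
    '"prompt_id": "gpt-4.1-joke", "prompt_title": "Programming Joke", '
    '"prompt_category": "humor", "response_type": "text", "date": "2025-04-15"}'
)

full_fields = set(json.loads(full_line))
sample_fields = set(json.loads(sample_line))

# Fields present in the full dataset but absent from the free sample.
missing = sorted(full_fields - sample_fields)
print(missing)  # ['content', 'prompt_text']
```

Code written against the sample should therefore treat `content` and `prompt_text` as optional if it is meant to run on both files.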

Full dataset
$249

All 5,629 responses with full content

  • Full response content (HTML, SVG, text, code)
  • All metadata fields included
  • 5,629 responses from 200+ models
  • JSONL format — stream-friendly
  • Commercial license included
  • One-time purchase, February 2026 snapshot

Start building

Free sample included. No account required.