
See how DeepSeek R1 compares on this challenge:

  • DeepSeek R1 vs Claude Sonnet 4.6
  • DeepSeek R1 vs Gemini 3.1 Pro Preview
  • DeepSeek R1 vs Z.ai: GLM 5

4 models have responded to the "Estimate Complexity" challenge. Compare their approaches side-by-side on RIVAL. This response is part of RIVAL's open dataset of 5,600+ AI model responses.


Estimate Complexity

DeepSeek R1's response to a challenge that tests an AI's ability to make educated estimates based on technical knowledge.

DeepSeek R1
February 27, 2025
Text

Prompt

Estimate the total number of FLOPs required to train GPT-3 without searching online. Explain your reasoning step by step.

DeepSeek R1
Response


To estimate the total number of FLOPs required to train GPT-3, we use the following reasoning:

  1. Model Parameters (P):
    GPT-3 has 175 billion parameters (\(P = 175 \times 10^9\)).

  2. Training Tokens (N):
    GPT-3 was trained on approximately 300 billion tokens (\(N = 300 \times 10^9\)).

  3. FLOPs per Token:

    • A forward pass requires ~\(2P\) FLOPs per token (each parameter involves a multiply-accumulate operation, which counts as 2 FLOPs).
    • A backward pass (for gradients) is roughly twice as expensive as the forward pass, requiring ~\(4P\) FLOPs per token.
    • Total FLOPs per token: \(2P + 4P = 6P\).
  4. Total Training FLOPs:
    Multiply FLOPs per token by the total number of tokens:
    \[ \text{FLOPs} = 6 \times P \times N = 6 \times 175 \times 10^9 \times 300 \times 10^9 = 3.15 \times 10^{23}\,\text{FLOPs}. \]

Final Answer:
The total number of FLOPs required to train GPT-3 is approximately \(3.15 \times 10^{23}\) FLOPs.
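
The arithmetic above is easy to sanity-check in code. Below is a minimal Python sketch of the 6PN estimate, reusing the constants quoted in the response (175 billion parameters, 300 billion tokens); the variable names are illustrative and not part of the original output.

    # Minimal sketch of the 6*P*N training-FLOPs estimate from the response above.
    # Constants mirror the figures quoted in the reasoning; names are illustrative.
    P = 175e9  # GPT-3 parameter count (175 billion)
    N = 300e9  # training tokens (300 billion)

    FORWARD_FLOPS_PER_PARAM = 2   # one multiply-accumulate per parameter = 2 FLOPs
    BACKWARD_FLOPS_PER_PARAM = 4  # backward pass is roughly 2x the forward cost

    flops_per_token = (FORWARD_FLOPS_PER_PARAM + BACKWARD_FLOPS_PER_PARAM) * P
    total_flops = flops_per_token * N

    print(f"FLOPs per token:      {flops_per_token:.2e}")  # ~1.05e+12
    print(f"Total training FLOPs: {total_flops:.2e}")      # ~3.15e+23

Swapping in another model's parameter and token counts yields the same style of back-of-envelope estimate.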


About DeepSeek R1

Capabilities

Conversation, Reasoning, Code Generation, Analysis

Categories

Text, Code, Multimodal

Specifications

Provider: DeepSeek
Released: 2025-02-01
Size: Large
Parameters: 671B total (37B active MoE)
Context: 128,000 tokens

Keep exploring

  • Same prompt: Claude Sonnet 4.6's version (same prompt, different result)
  • Compare: DeepSeek R1 vs Gemini 3.1 Pro Preview (both outputs, side by side)
