DeepSeek V3.1 — Estimate Complexity

4 models have responded to the "Estimate Complexity" challenge. Compare their approaches side-by-side on Rival. This response is part of Rival's open dataset of 5,600+ AI model responses.


Estimate Complexity

DeepSeek V3.1's response to the 'Estimate Complexity' challenge.

DeepSeek V3.1
August 21, 2025
Text

Prompt

Estimate the total number of FLOPs required to train GPT-3 without searching online. Explain your reasoning step by step.

DeepSeek V3.1
Response


Of course. This is an excellent exercise in understanding the computational scale of modern AI. Here is a step-by-step reasoning to estimate the FLOPs required to train GPT-3 without searching for the answer online.

The most commonly cited figure for GPT-3 training is 3.14 × 10²³ FLOPs (314 zettaFLOPs). Let's see how we can get close to this value.

Step 1: The Dominant FLOPs Cost in Transformer Training

For a dense Transformer model like GPT-3, the vast majority of the computational cost comes from the matrix multiplications in the forward and backward passes; the optimizer update adds only a small overhead.

A well-established empirical formula for the total number of FLOPs required to train a transformer model is:

Total Training FLOPs ≈ 6 * (Number of Parameters) * (Number of Tokens in the Training Dataset)

Let's break down why the "6" is a good approximation:

  • Forward Pass: ~2 * P * N. For each parameter, we do roughly one multiplication and one addition per token.
  • Backward Pass: ~4 * P * N. Computing gradients typically costs about 2x the forward pass.
  • Optimizer (e.g., Adam): The weight update (with momentum and variance estimates) costs a few FLOPs per parameter per step, independent of the token count, so it is negligible at this scale. The "6" is simply 2 (forward) + 4 (backward).

This formula is widely used in the field (e.g., by OpenAI, DeepMind) for back-of-the-envelope calculations.
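To make the bookkeeping concrete, here is a minimal Python sketch of this rule of thumb; the function name and the forward/backward split are illustrative, not taken from any particular codebase:

```python
def estimate_training_flops(num_params: float, num_tokens: float) -> float:
    """Back-of-the-envelope training cost for a dense Transformer.

    Heuristic: ~2 FLOPs per parameter per token for the forward pass
    (one multiply + one add), ~4 for the backward pass (about 2x the
    forward). Optimizer overhead is treated as negligible.
    """
    forward_flops = 2 * num_params * num_tokens
    backward_flops = 4 * num_params * num_tokens
    return forward_flops + backward_flops  # = 6 * P * N
```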


Step 2: Key Numbers for GPT-3

We need two pieces of information:

  1. The number of parameters in GPT-3.
  2. The number of tokens it was trained on.

1. Number of Parameters (P): GPT-3 has a range of model sizes. The largest and most famous one is often called "GPT-3 175B" or "GPT-3 Davinci". As the name suggests, it has 175 billion parameters.

  • P = 175,000,000,000 = 1.75 × 10¹¹

2. Number of Tokens (N): This is the trickier part. OpenAI's paper states that the largest model was trained on roughly 300 billion tokens. The training mix blended sources of varying quality, sampled at different rates, so some sources were seen more than once and others only partially; the total processed was still about 300 billion tokens.

  • N = 300,000,000,000 = 3.0 × 10¹¹

Step 3: The Calculation

Now we plug these numbers into our formula:

Total Training FLOPs ≈ 6 * P * N

= 6 * (1.75 × 10¹¹) * (3.0 × 10¹¹)

Let's compute this step-by-step:

  1. 1.75 × 3.0 = 5.25
  2. 10¹¹ × 10¹¹ = 10²²
  3. So, 5.25 × 10²²
  4. Now multiply by 6: 6 * 5.25 × 10²² = 31.5 × 10²²

To express this in proper scientific notation: 31.5 × 10²² = 3.15 × 10²³ FLOPs
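The same arithmetic in a couple of lines of Python, as a quick check:

```python
P = 1.75e11  # 175 billion parameters
N = 3.0e11   # 300 billion training tokens
print(f"{6 * P * N:.3e} FLOPs")  # 3.150e+23 FLOPs
```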


Step 4: Final Estimate and Sanity Check

Our calculation gives us an estimate of 3.15 × 10²³ FLOPs.

  • This is remarkably close to the commonly referenced value of 3.14 × 10²³ FLOPs. The minor difference likely comes down to the exact token count or rounding in the constant factor.
  • This number is astronomically large. To put it in perspective:
    • A single NVIDIA A100 GPU (a common AI training chip) has a peak theoretical performance of ~3.12 × 10¹⁴ FLOP/s (312 teraFLOP/s) for mixed-precision training.
    • If one A100 could run at peak efficiency 100% of the time, it would take roughly 3.15 × 10²³ FLOPs ÷ 3.12 × 10¹⁴ FLOP/s ≈ 1.01 × 10⁹ seconds.
    • That is ~32 years of continuous computation on a single GPU.

This aligns with the reality that GPT-3 was trained on thousands of GPUs running in parallel for weeks or months.
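A short Python snippet reproduces this sanity check (the A100 figure is its published dense FP16/BF16 peak; real sustained throughput is considerably lower, so this is a best-case floor):

```python
TOTAL_FLOPS = 3.15e23   # estimate from Step 3
A100_PEAK = 3.12e14     # dense FP16/BF16 peak, FLOP/s
seconds = TOTAL_FLOPS / A100_PEAK
years = seconds / (365.25 * 24 * 3600)
print(f"{seconds:.2e} s ≈ {years:.0f} years on one A100 at peak")
# -> 1.01e+09 s ≈ 32 years
```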

Final Answer

Based on the formula Total Training FLOPs ≈ 6 * Parameters * Tokens, the estimated computational cost to train GPT-3 (175B parameters on 300B tokens) is:

~3.15 × 10²³ FLOPs


This output lives in our open dataset — 5K+ responses across 200 models. Take it. It’s free.

About DeepSeek V3.1

Capabilities

Conversation · Reasoning · Code Generation · Analysis · Agentic Tool Use · Function Calling · Tool Use

Categories

Text · Code · Multimodal

Specifications

Provider: DeepSeek
Released: 2025-08-21
Size: XLARGE
Context: 163,840 tokens

