See how GPT-5 Pro compares on this challenge:

  • GPT-5 Pro vs Claude Sonnet 4.6
  • GPT-5 Pro vs Gemini 3.1 Pro Preview
  • GPT-5 Pro vs Z.ai: GLM 5

4 models have responded to the "Estimate Complexity" challenge. Compare their approaches side-by-side on RIVAL. This response is part of RIVAL's open dataset of 5,600+ AI model responses.

Estimate Complexity

GPT-5 Pro's response to the 'Estimate Complexity' challenge.

GPT-5 Pro
October 6, 2025
Text

Prompt

Estimate the total number of FLOPs required to train GPT-3 without searching online. Explain your reasoning step by step.

GPT-5 Pro
Response

Here’s a simple, standard way to estimate training FLOPs for a large dense Transformer like GPT-3:

  1. Use the “6 N T” rule of thumb
  • For dense Transformers, the total training compute (in FLOPs) is well approximated by FLOPs ≈ 6 × N × T, where:
    • N = number of trainable parameters
    • T = number of tokens seen during training
  • Intuition for the 6: a forward pass costs roughly 2 FLOPs per parameter per token (one multiply and one add), and the backward pass costs roughly twice the forward pass (~4 FLOPs per parameter per token), giving ~6 in total. Non-matmul overheads (layernorm, softmax, etc.) are smaller-order and largely folded into this constant.
  2. Plug in GPT-3’s parameter count
  • N ≈ 175 billion = 1.75 × 10^11
  3. Choose a reasonable token count
  • GPT-3 175B is commonly associated with roughly T ≈ 300 billion tokens = 3.0 × 10^11.
  • This is consistent with the model being undertrained by modern “compute-optimal” standards (which would use many more tokens per parameter), but it matches the historical setup.
  4. Compute (a runnable sketch follows this list)
  • FLOPs ≈ 6 × (1.75 × 10^11) × (3.0 × 10^11)
  • = 6 × 5.25 × 10^22
  • = 3.15 × 10^23 FLOPs
  5. Caveats and range
  • Counting conventions: if one counts a fused multiply–add as 1 FLOP instead of 2, you’d get roughly half this value.
  • Real-world training often uses activation recomputation and has other overheads, which can add ~10–30% (or more) to the compute.
  • If the true token count were, say, 200–500B rather than exactly 300B, the estimate would scale linearly, to about 2.1 × 10^23 to 5.25 × 10^23 FLOPs. (Both caveats are quantified in the second sketch below.)
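
A minimal Python sketch of the arithmetic above; the parameter and token counts are the assumed round figures from steps 2–3, not measured values:

```python
# 6*N*T rule of thumb for dense-Transformer training compute.
N = 1.75e11   # GPT-3 parameter count (175B), assumed per step 2
T = 3.0e11    # training tokens (~300B), assumed per step 3

flops = 6 * N * T   # ~2 FLOPs/param/token forward + ~4 backward
print(f"Central estimate: {flops:.2e} FLOPs")   # -> 3.15e+23

# Step 5 range: the estimate scales linearly with the token count.
for tokens in (2.0e11, 5.0e11):                 # 200B and 500B tokens
    print(f"T = {tokens:.1e} tokens -> {6 * N * tokens:.2e} FLOPs")
```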

Final estimate

  • About 3 × 10^23 FLOPs to train GPT-3 (175B) end to end, with a plausible range of roughly 2–5 × 10^23 depending on exact tokens and overheads.
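
To make the caveats concrete, a small sensitivity sketch; the overhead factors and the FMA convention are the illustrative assumptions stated in step 5, not reported figures:

```python
central = 3.15e23  # the 6*N*T central estimate from above

# Counting a fused multiply-add as 1 FLOP (instead of 2) halves the figure.
print(f"FMA-as-one convention: {central / 2:.2e} FLOPs")   # ~1.6e+23

# Activation recomputation and other overheads: assumed +10% to +30%.
for factor in (1.10, 1.30):
    print(f"With {factor:.0%} of base compute: {central * factor:.2e} FLOPs")
```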

About GPT-5 Pro

Capabilities

  • Conversation
  • Reasoning
  • Code Generation
  • Analysis

Categories

  • Text
  • Code

Specifications

  • Provider: OpenAI
  • Released: 2025-10-06
  • Size: XLARGE
  • Context: 400,000 tokens
