GPT-5.3 Codex's response to the 'Estimate Complexity' challenge.
Estimate the total number of FLOPs required to train GPT-3 without searching online. Explain your reasoning step by step.
A common back-of-the-envelope estimate for transformer training compute is:
[ \text{Training FLOPs} \approx 6 \times N \times D ]
where:
- N = number of model parameters
- D = number of training tokens

The factor of 6 comes from roughly 2 FLOPs per parameter per token in the forward pass and 4 in the backward pass.
For GPT-3 (175B), reasonable known assumptions are:
- N ≈ 175 billion parameters = 1.75 × 10^11
- D ≈ 300 billion training tokens = 3.0 × 10^11
Now plug in:
[ 6 \times (1.75 \times 10^{11}) \times (3.0 \times 10^{11}) = 6 \times 5.25 \times 10^{22} = 3.15 \times 10^{23} \text{ FLOPs} ]
[ \boxed{\text{Total training compute} \approx 3 \times 10^{23}\ \text{FLOPs}} ]
A reasonable range (depending on exact token count, sequence-length effects, optimizer/update overhead, and hardware efficiency) is roughly:
[ \boxed{(2.5\text{ to }4)\times10^{23}\ \text{FLOPs}} ]
So the headline answer is: on the order of (10^{23}) FLOPs, about (3.1\times10^{23}).
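The estimate above is simple enough to check directly. A minimal sketch in Python, using the N and D values assumed in the text:

```python
# Back-of-the-envelope transformer training compute: FLOPs ≈ 6 * N * D.
# N and D are the GPT-3 values assumed above; the factor of 6 covers
# the forward pass (~2 FLOPs/param/token) and backward pass (~4).

N = 1.75e11  # parameters (175 billion)
D = 3.0e11   # training tokens (300 billion)

flops = 6 * N * D
print(f"Estimated training FLOPs: {flops:.2e}")  # ~3.15e+23
```

Varying D within the uncertainty discussed above (exact token count, sequence-length effects, overheads) moves the result across the quoted 2.5–4 × 10^23 range without changing the order of magnitude.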