Qwen Plus 2025 07 28's response to the 'Estimate Complexity' challenge.
Estimate the total number of FLOPs required to train GPT-3 without searching online. Explain your reasoning step by step.
To estimate the total number of floating-point operations (FLOPs) required to train GPT-3, we can use a widely adopted approximation based on the model size, the number of training tokens, and the architecture of transformer-based models. We’ll do this without looking up specific values, relying only on known public disclosures and reasonable assumptions.
For transformer models, a well-known approximation for the total training FLOPs is:
[ \text{Total FLOPs} \approx 6 \times N \times D ]
Where:

- N is the number of model parameters,
- D is the number of tokens seen during training.

The factor of 6 comes from the per-token computational cost of a training step: the forward pass takes roughly 2N FLOPs (one multiply and one add per parameter), and the backward pass takes roughly twice that, about 4N FLOPs.

So the total is roughly 6N FLOPs per token, hence:
[ \text{Total FLOPs} \approx 6 \times N \times D ]
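This approximation is easy to wrap in a small helper. The sketch below is mine (the function name is illustrative); it simply encodes the standard 2N-forward / 4N-backward accounting:

```python
def estimate_training_flops(n_params: float, n_tokens: float) -> float:
    """Estimate total training FLOPs for a dense transformer.

    Uses the standard approximation: ~2*N FLOPs per token for the
    forward pass plus ~4*N for the backward pass, i.e. 6*N*D total.
    """
    return 6.0 * n_params * n_tokens
```

For example, `estimate_training_flops(1.75e11, 3e11)` reproduces the GPT-3 figure derived below.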
GPT-3 was released with multiple sizes, but the largest version had 175 billion parameters.
So: [ N = 175 \times 10^9 = 1.75 \times 10^{11} ]
From public papers (e.g., the GPT-3 paper by Brown et al., 2020), the model was trained on 300 billion tokens.
So: [ D = 300 \times 10^9 = 3 \times 10^{11} ]
(Note: this figure comes from the literature; here we assume it is recalled from general AI knowledge rather than looked up.)
[ \text{Total FLOPs} \approx 6 \times (1.75 \times 10^{11}) \times (3 \times 10^{11}) ]
First, multiply the coefficients: [ 6 \times 1.75 \times 3 = 6 \times 5.25 = 31.5 ]
Now the exponents: [ 10^{11} \times 10^{11} = 10^{22} ]
So: [ \text{Total FLOPs} \approx 31.5 \times 10^{22} = 3.15 \times 10^{23} ]
[ \boxed{3.15 \times 10^{23}} \text{ FLOPs} ]
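Training compute at this scale is often quoted in petaFLOP/s-days; the conversion below is my own cross-check, not part of the derivation above:

```python
total_flops = 6 * 1.75e11 * 3.0e11   # the 6*N*D estimate, ~3.15e23 FLOPs
pflops_day = 1e15 * 86_400           # FLOPs delivered by 1 petaFLOP/s for one day
print(total_flops / pflops_day)      # ~3.6e3 petaFLOP/s-days
```

The result, roughly 3,600 petaFLOP/s-days, is consistent with the compute figure reported in the GPT-3 paper.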
This is a standard estimate used in the field and is consistent with published analyses (e.g., the scaling-laws work of Kaplan et al.), even without searching online.