DeepSeek V3.1 vs Gemini 3 Pro Preview
Compare DeepSeek V3.1 by DeepSeek against Gemini 3 Pro Preview by Google AI, tested across 49 shared challenges. Updated February 2026.
44 challenges
Tests an AI's ability to make educated estimates based on technical knowledge
To estimate the total FLOPs (floating-point operations) required to train GPT-3, we can use a standard approximation formula derived from the architecture of Transformer models.
Here is the step-by-step breakdown of the estimation:
To calculate training compute, we need two primary numbers: the number of parameters in the model ($P$) and the number of tokens the model was trained on ($D$).
For dense Transformer models, a widely accepted heuristic (often cited in papers by Kaplan et al. and DeepMind regarding scaling laws) is that training requires approximately 6 FLOPs per parameter per training token.
Therefore, the formula for Total Training Compute ($C$) is: $$C \approx 6 \times P \times D$$
Now, substitute the values into the formula, taking $P \approx 175 \times 10^9$ parameters and $D \approx 300 \times 10^9$ training tokens:
$$C \approx 6 \times (175 \times 10^9) \times (300 \times 10^9)$$
First, group the integers and the powers of 10: $$C \approx (6 \times 175 \times 300) \times (10^9 \times 10^9)$$
Calculate the integers: $$6 \times 175 = 1,050$$ $$1,050 \times 300 = 315,000$$
Combine with the exponent ($10^{18}$): $$C \approx 315,000 \times 10^{18} \text{ FLOPs}$$
To make the number readable, we normalize it:
$$315,000 = 3.15 \times 10^5$$
Combine exponents: $$3.15 \times 10^5 \times 10^{18} = 3.15 \times 10^{23}$$
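The arithmetic above can be checked with a short Python sketch. The function name `training_flops` is hypothetical, the inputs are the parameter and token counts used in the substitution, and the factor of 6 is the heuristic quoted earlier rather than an exact operation count.

```python
def training_flops(params: int, tokens: int, flops_per_param_token: int = 6) -> int:
    """Approximate training compute as C ~ 6 * P * D (dense Transformer heuristic)."""
    return flops_per_param_token * params * tokens


P = 175 * 10**9  # parameters, as used in the substitution above
D = 300 * 10**9  # training tokens, as used in the substitution above

C = training_flops(P, D)
print(f"{C:.2e} FLOPs")  # prints "3.15e+23 FLOPs"
```

Using Python integers keeps the product exact, so the only rounding happens when the result is formatted for display.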
The estimated total compute required to train GPT-3 is approximately:
$$3.15 \times 10^{23} \text{ FLOPs}$$
(Note: In the AI literature, this is often converted to "Petaflop-days." One Petaflop-day is approximately $8.64 \times 10^{19}$ FLOPs. Dividing $3.15 \times 10^{23}$ by that number yields roughly 3,650 Petaflop-days, which aligns closely with OpenAI's published figures.)
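The Petaflop-day conversion in the note can be sketched in a few lines of Python. This assumes one Petaflop-day means $10^{15}$ floating-point operations per second sustained for 86,400 seconds, and it takes the $3.15 \times 10^{23}$ result of the derivation as input; the names `FLOPS_PER_PETAFLOP_DAY` and `to_petaflop_days` are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60                     # 86,400 seconds
FLOPS_PER_PETAFLOP_DAY = 10**15 * SECONDS_PER_DAY  # 8.64e19 FLOPs


def to_petaflop_days(total_flops: int) -> float:
    """Convert a total FLOP count into Petaflop-days."""
    return total_flops / FLOPS_PER_PETAFLOP_DAY


total = 6 * (175 * 10**9) * (300 * 10**9)  # C ~ 6 * P * D from the derivation
pf_days = to_petaflop_days(total)
print(round(pf_days))  # prints 3646
```

The result is a back-of-the-envelope figure: it inherits the approximation in the factor of 6 and in the rounded parameter and token counts.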
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Tests an AI's understanding of number representation
Tests an AI's randomness and creativity
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
Tests an AI's ability to simulate personalities and predict future trends
Tests an AI's humor and understanding of current events
Tests an AI's ability to generate a complete, working landing page
Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.