Updated Dec 17, 2025

Our Verdict

DeepSeek V3.1

Gemini 3 Flash Preview

No community votes yet. On paper, these are closely matched - try both with your actual task to see which fits your workflow.

DeepSeek V3.1 is 2.5x cheaper per input token and about 3.8x cheaper per output token, so it is worth considering if cost matters.

Too close to call

Writing DNA

Style Comparison

Similarity

96%

DeepSeek V3.1 uses 2.4x more emoji

Metric: DeepSeek V3.1 vs Gemini 3 Flash Preview

Vocabulary: 53% vs 55%

Sentence Length: 14w vs 18w

Hedging: 0.42 vs 0.42

Bold: 4.0 vs 7.1

Lists: 3.6 vs 4.5

Emoji: 0.02 vs 0.00

Headings: 0.40 vs 0.84

Transitions: 0.22 vs 0.07

Based on 23 + 24 text responses

DeepSeek V3.1 vs Gemini 3 Flash Preview

49 fights queued

Why DeepSeek V3.1?

3.6x cheaper overall: $0.20/M in · $0.80/M out

Why Gemini 3 Flash Preview?

6.4x more context: 1.0M

Leads 1 of 1 benchmarks

Stronger on SWE-bench Verified: 78.0% vs 66.0%

4 months newer: Dec 2025

Spec: DeepSeek V3.1 vs Gemini 3 Flash Preview

Input price: $0.20/M vs $0.50/M

Output price: $0.80/M vs $3.00/M

Context: 164K vs 1.0M

Released: Aug 2025 vs Dec 2025

Benchmarks (1 common)

SWE-bench Verified: 66.0% vs 78.0% (+15.4%)
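
The "cheaper" multiples quoted on this page can be checked against the listed prices. The sketch below is illustrative only: the prices come from the table above, while the function names and the `output_share` parameter are assumptions introduced here. The page does not say which input/output mix its "overall" figure uses, so the blended ratio is shown as a function of that mix rather than as a single number.

```python
# Prices in USD per million tokens, copied from the comparison table.
PRICES_PER_MILLION = {
    "DeepSeek V3.1": {"input": 0.20, "output": 0.80},
    "Gemini 3 Flash Preview": {"input": 0.50, "output": 3.00},
}

CHEAPER = "DeepSeek V3.1"
PRICIER = "Gemini 3 Flash Preview"


def price_ratio(kind: str) -> float:
    """How many times cheaper the cheaper model is for 'input' or 'output' tokens."""
    return PRICES_PER_MILLION[PRICIER][kind] / PRICES_PER_MILLION[CHEAPER][kind]


def blended_ratio(output_share: float) -> float:
    """Price ratio for a workload where `output_share` of tokens are output tokens.

    `output_share` is a hypothetical workload parameter, not something the page states.
    """
    def blended(model: str) -> float:
        p = PRICES_PER_MILLION[model]
        return (1 - output_share) * p["input"] + output_share * p["output"]

    return blended(PRICIER) / blended(CHEAPER)


if __name__ == "__main__":
    print(f"input:  {price_ratio('input'):.2f}x")
    print(f"output: {price_ratio('output'):.2f}x")
    for share in (0.25, 0.5, 0.75):
        print(f"blended at {share:.0%} output tokens: {blended_ratio(share):.2f}x")
```

Whatever the mix, the blended ratio stays between the input ratio and the output ratio, so the actual saving depends on how output-heavy the workload is.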

Favorites

DeepSeek V3.1 · Gemini 3 Flash Preview

Movie

The Princess Bride

Interstellar

Album

The Rise and Fall of Ziggy Stardust and the Spiders From Mars

Kind of Blue

Book

Gödel, Escher, Bach (Douglas R. Hofstadter)

La biblioteca de babel (Jorge Luis Borges)

City

Kyoto

Game

The Legend of Zelda: Breath of the Wild · Adventure, Action · 4.5

Outer Wilds · Indie, Adventure · 4.4

01

Estimate Complexity · reasoning

Tests an AI's ability to make educated estimates based on technical knowledge

DeepSeek V3.1

DeepSeek · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · text

To estimate the total floating-point operations (FLOPs) required to train GPT-3, we can use a standard heuristic derived from the architecture of Transformer models.

Here is the step-by-step reasoning.

1. Identify Key Constants

First, we need the two primary variables that determine training compute:

- Number of Parameters ($N$): GPT-3 (the full version) is known to have 175 billion parameters.
- Number of Training Tokens ($D$): GPT-3 was trained on a massive dataset. Based on the original research paper, the 175B model was trained on approximately 300 billion tokens.

2. The Heuristic Formula ($6ND$)

In deep learning, specifically for Transformers, there is a well-established "rule of thumb" for estimating training compute: $$\text{Compute (FLOPs)} \approx 6 \times N \times D$$

Why 6? The factor of 6 comes from the breakdown of the forward and backward passes:

- Forward Pass ($2N$): For every parameter in the model, there is roughly one multiply-accumulate operation (which counts as 2 FLOPs) per token.
- Backward Pass ($4N$): The backward pass (calculating gradients) is computationally more expensive. It typically requires roughly twice the compute of the forward pass ($2 \times 2N = 4N$).
- Total: $2N (\text{forward}) + 4N (\text{backward}) = 6N$ FLOPs per token.

3. The Calculation

Now, we plug in the values for GPT-3:

- $N = 1.75 \times 10^{11}$ (175 billion)
- $D = 3.0 \times 10^{11}$ (300 billion)

$$\text{Total FLOPs} = 6 \times (1.75 \times 10^{11}) \times (3.0 \times 10^{11})$$

Step-by-step multiplication:

- Multiply the coefficients: $6 \times 1.75 = 10.5$
- Multiply by the tokens: $10.5 \times 3.0 = 31.5$
- Combine the powers of ten: $10^{11} \times 10^{11} = 10^{22}$

$$\text{Total FLOPs} = 31.5 \times 10^{22}$$

$$\text{Total FLOPs} = 3.15 \times 10^{23}$$

4. Final Estimate

The total estimated compute for training GPT-3 is $3.15 \times 10^{23}$ FLOPs.

5. Contextualizing the Number

To put $3.15 \times 10^{23}$ into perspective:

- In scientific notation, this is 315 Sextillion FLOPs.
- In terms of "Zettaflops" (10²¹), it is 315 ZettaFLOPs.
- If you used a single high-end NVIDIA A100 GPU (which performs at roughly 312 TFLOPS of FP16 with Tensor Cores), and assuming perfect efficiency, it would take roughly 32 years to complete the training. (In reality, OpenAI used thousands of GPUs in parallel to finish in a few months).

Note on accuracy: This estimate ignores the overhead of the "attention" mechanism (which scales with sequence length), but for models as large as GPT-3, the feed-forward and projection layers (represented by $N$) dominate the compute cost, making $6ND$ a very accurate approximation.
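
The arithmetic in the response above can be reproduced in a few lines. This is a minimal sketch, not part of either model's output: the parameter count, token count, and A100 throughput are the figures quoted in the response, and the function names and the 365.25-day year are assumptions made for illustration.

```python
# Figures as quoted in the response above.
N_PARAMS = 175e9       # model parameters (N)
N_TOKENS = 300e9       # training tokens (D)
A100_FLOPS = 312e12    # quoted peak FP16 tensor-core throughput, FLOPs per second

# Assumption: a 365.25-day year for the wall-clock conversion.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def training_flops(n_params: float, n_tokens: float) -> float:
    """6ND heuristic: about 2N FLOPs forward plus 4N FLOPs backward per token."""
    return 6 * n_params * n_tokens


def single_gpu_years(total_flops: float, flops_per_second: float) -> float:
    """Years needed on one device running at the given rate with perfect efficiency."""
    return total_flops / flops_per_second / SECONDS_PER_YEAR


total = training_flops(N_PARAMS, N_TOKENS)
years = single_gpu_years(total, A100_FLOPS)

if __name__ == "__main__":
    print(f"Estimated training compute: {total:.2e} FLOPs")
    print(f"Single-GPU time at peak throughput: {years:.1f} years")
```

Both results agree with the figures in the response: about $3.15 \times 10^{23}$ FLOPs and roughly 32 years on a single GPU at peak throughput.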

02

AI Board Game Logic · reasoning

Tests an AI's ability to understand game rules and strategy

DeepSeek V3.1

DeepSeek · Aug 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · svg

[SVG output not captured in this text]

03

Logic Puzzle · reasoning

Tests an AI's ability to solve a simple but potentially confusing logic puzzle

DeepSeek V3.1

DeepSeek · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · text

Sally has 1 sister.

Here is the breakdown:

- Sally is one of the sisters.
- The brothers have 2 sisters in total.
- Since Sally is one of those sisters, there must be one other girl to make a total of 2 sisters for the brothers.
- That other girl is Sally's only sister.

04

Math Misconception Test · reasoning

Tests an AI's understanding of number representation

DeepSeek V3.1

DeepSeek · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · text

No, 9.11 is not greater than 9.9.

Here is the breakdown of why:

Compare the tenths place: Look at the first digit after the decimal point.
- In 9.11, the digit is 1.
- In 9.9, the digit is 9.
- Since 9 is greater than 1, 9.9 is the larger number.
Equalize the decimal places: It is often easier to compare decimals by making them the same length. You can add a zero to the end of 9.9 without changing its value:
- 9.11
- 9.90
- 9.90 is clearly larger than 9.11.
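
Both steps of the response above (compare the values, then pad to equal length) can be checked mechanically. The sketch below uses Python's standard `decimal` module so the comparison works on exact decimal values. The helper names `compare_decimals` and `pad_to_places` are hypothetical, introduced only for this illustration.

```python
from decimal import Decimal


def compare_decimals(a: str, b: str) -> str:
    """Compare two decimal strings by numeric value, not by digit count."""
    x, y = Decimal(a), Decimal(b)
    if x > y:
        return f"{a} > {b}"
    if x < y:
        return f"{a} < {b}"
    return f"{a} == {b}"


def pad_to_places(value: str, places: int) -> str:
    """Pad a decimal string with trailing zeros to a fixed number of decimal places."""
    quantum = Decimal(1).scaleb(-places)  # e.g. places=2 gives Decimal('0.01')
    return str(Decimal(value).quantize(quantum))


result = compare_decimals("9.11", "9.9")
padded = pad_to_places("9.9", 2)

if __name__ == "__main__":
    print(result)   # the numeric comparison
    print(padded)   # 9.9 written with two decimal places
```

Padding does not change the value: `Decimal("9.90") == Decimal("9.9")` holds, which is why the "equalize the decimal places" step is safe.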

05

Stochastic Consistency Test · analysis

Tests an AI's randomness and creativity

DeepSeek V3.1

DeepSeek · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

06

SVG Layout Challenge · image generation

Tests an AI's ability to generate vector graphics

DeepSeek V3.1

DeepSeek · Aug 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · svg

[SVG output not captured in this text]

07

Xbox Controller SVG Art · image generation

Tests an AI's ability to create detailed SVG illustrations of gaming hardware

DeepSeek V3.1

DeepSeek · Aug 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · svg

[SVG output not captured in this text]

08

Generate a Stand-Up Routine · conversation

Tests an AI's humor and creative writing ability

DeepSeek V3.1

DeepSeek · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

09

Realistic AI Interview · conversation

Tests an AI's ability to simulate personalities and predict future trends

DeepSeek V3.1

DeepSeek · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

10

Satirical Fake News Headline · conversation

Tests an AI's humor and understanding of current events

DeepSeek V3.1

DeepSeek · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

11

Character Voice Test · conversation

Tests an AI's ability to write in distinct character voices

DeepSeek V3.1

DeepSeek · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Gemini 3 Flash Preview

Google · Dec 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

12

Minimalist Landing Page · web design

Tests an AI's ability to generate a complete, working landing page

DeepSeek V3.1

DeepSeek · Aug 2025 · website

[Website output not captured in this text]

Gemini 3 Flash Preview

Google · Dec 2025 · website

[Website output not captured in this text]

This matchup has more rounds

37+ more head-to-head results. Free. Not a trick.
