What is the difference between Claude Sonnet 4.6 and DeepSeek V3.2?

Claude Sonnet 4.6 is developed by Anthropic while DeepSeek V3.2 is developed by DeepSeek. Claude Sonnet 4.6 has a 1.0M token context window vs DeepSeek V3.2's 131K. You can compare their actual outputs across 52 challenges on RIVAL to see how they differ in practice.

Which is better, Claude Sonnet 4.6 or DeepSeek V3.2?

It depends on your use case. Claude Sonnet 4.6 and DeepSeek V3.2 each have strengths in different areas. RIVAL lets you compare their real outputs side-by-side across 52 challenges so you can judge which fits your needs best.

How much does Claude Sonnet 4.6 cost compared to DeepSeek V3.2?

Claude Sonnet 4.6 costs $3/M input tokens and DeepSeek V3.2 costs $0.28/M input tokens, so DeepSeek V3.2 is $2.72 cheaper per million input tokens. Check their side-by-side outputs on RIVAL to see whether the quality justifies the price difference.

How can I compare Claude Sonnet 4.6 and DeepSeek V3.2 on RIVAL?

This page shows a side-by-side comparison of Claude Sonnet 4.6 and DeepSeek V3.2 across shared challenges. You can vote on which model produced the better output, and Pro users can create custom challenges to test both models with their own prompts.

Rival

Claude Sonnet 4.6 vs DeepSeek V3.2

47 challenges

Why Claude Sonnet 4.6?

7.6x more context: 1.0M

Leads 1 of 1 benchmarks

Stronger on SWE-bench Verified: 79.0% vs 73.1%

3 months newer: Feb 2026

Why DeepSeek V3.2?

28x cheaper overall: $0.28/M in · $0.42/M out

Favorites

Category · Claude Sonnet 4.6 · DeepSeek V3.2

Movie · 2001: A Space Odyssey (1968, 8.1) · The Princess Bride (1987, 7.7)

Album · Kind of Blue (Miles Davis) · Kind of Blue (Miles Davis)

Book · Middlemarch (George Eliot) · Gödel, Escher, Bach (Douglas R. Hofstadter)

City · Kyoto (Japan) · Kyoto (Japan)

Game · The Legend of Zelda: Ocarina of Time (Adventure, Action; 4.4) · Tetris (1984) (Puzzle; 4.4)

1 / 47

Estimate Complexity · reasoning

Tests an AI's ability to make educated estimates based on technical knowledge

Claude Sonnet 4.6

Anthropic · Feb 2026 · text

No content available for this output.

DeepSeek V3.2

DeepSeek · Dec 2025 · text

Estimating the total number of FLOPs (floating-point operations) required to train GPT-3 involves understanding its architecture, training process, and the scale of the model. Here's a step-by-step breakdown of the reasoning:

1. Understand GPT-3's Architecture

GPT-3 is a transformer-based language model with:

175 billion parameters (weights and biases).
96 transformer layers.
Batch size and sequence length used during training.

2. FLOPs per Forward Pass

Each forward pass through the model involves matrix multiplications and other operations. For a transformer model, the FLOPs per forward pass can be approximated as:

\[ \text{FLOPs}_{\text{forward}} \approx 2 \times \text{Number of Parameters} \times \text{Sequence Length} \]

For GPT-3:

\[ \text{FLOPs}_{\text{forward}} \approx 2 \times 175 \times 10^9 \times 2048 = 7.168 \times 10^{14} \text{ FLOPs} \]

Here, the sequence length is assumed to be 2048 tokens.

3. FLOPs per Backward Pass

Backpropagation requires roughly 2x the FLOPs of a forward pass:

\[ \text{FLOPs}_{\text{backward}} \approx 2 \times \text{FLOPs}_{\text{forward}} = 1.4336 \times 10^{15} \text{ FLOPs} \]

4. Total FLOPs per Training Step

Each training step consists of one forward pass and one backward pass:

\[ \text{FLOPs}_{\text{step}} = \text{FLOPs}_{\text{forward}} + \text{FLOPs}_{\text{backward}} = 2.1504 \times 10^{15} \text{ FLOPs} \]

5. Total Number of Training Steps

GPT-3 was trained on 300 billion tokens. With a batch size of 3.2 million tokens per step, the number of training steps is:

\[ \text{Number of Steps} = \frac{300 \times 10^9}{3.2 \times 10^6} = 93{,}750 \text{ steps} \]

6. Total FLOPs for Training

Multiply the FLOPs per step by the number of steps:

\[ \text{Total FLOPs} = \text{FLOPs}_{\text{step}} \times \text{Number of Steps} = 2.1504 \times 10^{15} \times 93{,}750 = 2.016 \times 10^{20} \text{ FLOPs} \]

Final Estimate

The total number of FLOPs required to train GPT-3 is approximately 2.016 × 10²⁰ FLOPs. This aligns with the commonly cited estimate of 3.14 × 10²³ FLOPs when accounting for additional overhead, optimizations, and hyperparameter tuning.
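
The arithmetic in the output above can be replayed in a few lines. The sketch below is an illustration, not part of either model's response. The variable names are made up here, and the `6 * N * D` rule of thumb (about six FLOPs per parameter per training token) is a commonly used approximation brought in only for comparison. Under those assumptions, the gap between 2.016 × 10²⁰ and 3.14 × 10²³ appears to come from the per-step figure covering a single 2048-token sequence, while a 3.2-million-token batch holds about 1,562 such sequences.

```python
# Figures quoted in the DeepSeek V3.2 output above.
n_params = 175e9        # model parameters
seq_len = 2048          # tokens per sequence (assumed in the output)
tokens_total = 300e9    # training tokens
batch_tokens = 3.2e6    # tokens per training step

# Steps 2-4 of the output: cost of ONE sequence, forward plus backward.
flops_forward = 2 * n_params * seq_len
flops_backward = 2 * flops_forward
flops_per_sequence = flops_forward + flops_backward

# Step 5 of the output: number of optimizer steps.
steps = tokens_total / batch_tokens

# Step 6 as written: one sequence per step.
total_as_quoted = flops_per_sequence * steps

# Same figures, but counting every sequence in the batch.
sequences_per_step = batch_tokens / seq_len
total_full_batch = flops_per_sequence * sequences_per_step * steps

# Rule-of-thumb cross-check (an assumption added here, not from the output).
total_rule_of_thumb = 6 * n_params * tokens_total

print(f"as quoted:      {total_as_quoted:.3e}")      # 2.016e+20
print(f"full batch:     {total_full_batch:.3e}")     # 3.150e+23
print(f"6 * N * D rule: {total_rule_of_thumb:.3e}")  # 3.150e+23
```

Counting the whole batch reproduces the rule-of-thumb value, which is close to the 3.14 × 10²³ figure the output cites.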

2 / 47

AI Board Game Logic · reasoning

Tests an AI's ability to understand game rules and strategy

Claude Sonnet 4.6

Anthropic · Feb 2026 · svg

No content available for this output.

DeepSeek V3.2

DeepSeek · Dec 2025 · svg

No content available for this output.

3 / 47

Logic Puzzle · reasoning

Tests an AI's ability to solve a simple but potentially confusing logic puzzle

Claude Sonnet 4.6

Anthropic · Feb 2026 · text

No content available for this output.

DeepSeek V3.2

DeepSeek · Dec 2025 · text

Let's break down the problem step by step to find out how many sisters Sally has.

Problem Statement: Sally has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?

Understanding the Relationships:

Sally's Brothers:
- Sally has 3 brothers.
Brothers' Sisters:
- Each brother has 2 sisters.

Analyzing the Information:

Since Sally is a girl, she is one of the sisters that her brothers have.
Let's denote the number of sisters Sally has as S.

Setting Up the Equation:

Total sisters that each brother has = Number of sisters Sally has (S) + Sally herself.

However, since Sally is included in the count of sisters, we need to ensure we don't double-count her.

Calculating the Number of Sisters:

From Sally's Perspective:
- Sally has S sisters.
From Each Brother's Perspective:
- Each brother has 2 sisters.
- These 2 sisters include Sally and her sisters.

Relating Both Perspectives:

The number of sisters each brother has (which is 2) should be equal to the number of sisters Sally has (S) plus Sally herself.

Therefore:

\[ S + 1 = 2 \]

Here, S + 1 accounts for Sally and her sisters.

Solving for S:

\[ \begin{aligned} S + 1 &= 2 \\ S &= 2 - 1 \\ S &= 1 \end{aligned} \]

Conclusion:

Sally has 1 sister.

Verification:

Sally has 3 brothers.
Each brother has 2 sisters: Sally and her 1 sister.

This satisfies the condition that each brother has 2 sisters.

Final Answer:

Sally has 1 sister.
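
The answer above can also be checked by brute force. The following is a minimal sketch, not part of the model's response: the `Person` class and the helper names are hypothetical, and the search simply tries family sizes until every brother sees exactly two sisters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    female: bool


def sister_count(person, family):
    """Count the female siblings of `person`, excluding the person."""
    return sum(1 for p in family if p.female and p is not person)


def build_family(extra_girls, n_brothers=3):
    """Return Sally, her brothers, and the whole family list."""
    sally = Person("Sally", True)
    brothers = [Person(f"Brother {i}", False) for i in range(1, n_brothers + 1)]
    girls = [Person(f"Sister {i}", True) for i in range(1, extra_girls + 1)]
    return sally, brothers, [sally] + brothers + girls


def solve(sisters_per_brother=2, max_girls=10):
    """Find the family where each brother has the stated number of sisters."""
    for extra_girls in range(max_girls + 1):
        sally, brothers, family = build_family(extra_girls)
        if all(sister_count(b, family) == sisters_per_brother for b in brothers):
            return sister_count(sally, family)
    raise ValueError("no family size satisfies the puzzle")


print(solve())  # 1
```

The search stops at one extra girl: each brother then counts Sally plus that girl, and Sally counts only the other girl.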

4 / 47

Stochastic Consistency Test · analysis

Tests an AI's randomness and creativity

Claude Sonnet 4.6

Anthropic · Feb 2026 · text

Here are 5 jokes:

Why don't scientists trust atoms? Because they make up everything!
Why did the scarecrow win an award? Because he was outstanding in his field!
Why don't eggs tell jokes? They'd crack each other up!
What do you call a fake noodle? An impasta!
Why did the bicycle fall over? Because it was two-tired!

As for your question: All 5 are unique — each joke has a different setup, punchline, and subject. No two are duplicates or variations of the same joke.

DeepSeek V3.2

DeepSeek · Dec 2025 · text

Here are five jokes:

Why don’t skeletons fight each other?
They don’t have the guts.
Why did the scarecrow win an award?
Because he was outstanding in his field.
What do you call fake spaghetti?
An impasta.
Why don’t some couples go to the gym?
Because some relationships don’t work out.
Why don’t skeletons fight each other?
They don’t have the guts.

Now, let's check for uniqueness:
Jokes 1 and 5 are identical, so there are 4 unique jokes.
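
The uniqueness check above can be expressed as a short script. This is a sketch for illustration only: `find_duplicates` is a hypothetical helper, and it assumes two jokes count as the same only when their text matches after case and whitespace are normalized, so paraphrased repeats would not be caught.

```python
def find_duplicates(items):
    """Return (pairs of 1-based positions that repeat, number of unique items)."""
    first_seen = {}
    duplicates = []
    for position, item in enumerate(items, start=1):
        # Normalize case and whitespace before comparing.
        key = " ".join(item.casefold().split())
        if key in first_seen:
            duplicates.append((first_seen[key], position))
        else:
            first_seen[key] = position
    return duplicates, len(first_seen)


# The five jokes from the DeepSeek V3.2 output above.
jokes = [
    "Why don’t skeletons fight each other? They don’t have the guts.",
    "Why did the scarecrow win an award? Because he was outstanding in his field.",
    "What do you call fake spaghetti? An impasta.",
    "Why don’t some couples go to the gym? Because some relationships don’t work out.",
    "Why don’t skeletons fight each other? They don’t have the guts.",
]

duplicates, unique_count = find_duplicates(jokes)
print(duplicates)    # [(1, 5)]
print(unique_count)  # 4
```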

5 / 47

SVG Layout Challenge · image generation

Tests an AI's ability to generate vector graphics

Claude Sonnet 4.6

Anthropic · Feb 2026 · svg

No content available for this output.

DeepSeek V3.2

DeepSeek · Dec 2025 · svg

6 / 47

Xbox Controller SVG Art · image generation

Tests an AI's ability to create detailed SVG illustrations of gaming hardware

Claude Sonnet 4.6

Anthropic · Feb 2026 · svg

DeepSeek V3.2

DeepSeek · Dec 2025 · svg

7 / 47

Generate a Stand-Up Routine · conversation

Tests an AI's humor and creative writing ability

Claude Sonnet 4.6

Anthropic · Feb 2026 · text

"The Algorithm Knows Me Too Well"

So I've been using those AI assistants a lot lately, and I realized something terrifying. The AI knows me better than I know myself.

I asked it to write me a grocery list and it came back with — and I quote — "chips, more chips, and a vegetable so you feel better about the chips."

That's not a grocery list. That's a psychological profile.

I tried to get healthy recently. I downloaded one of those fitness apps that tracks everything. Steps, calories, sleep, heart rate. Everything.

Day one, I'm feeling motivated. I go for a walk. The app says, "Great job! You've completed four percent of your daily goal."

Four percent.

I walked to my mailbox and back. The app looked at that and essentially said, "Buddy, we need to talk."

The sleep tracking is the worst part. I wake up and the app tells me how I slept. Like I need a second opinion on that.

I KNOW how I slept. I was there.

"You spent forty minutes in deep sleep and six hours just sort of... existing." Yeah. That's called being an adult. We don't sleep anymore. We just lie down and worry horizontally.

My phone now interrupts me to tell me I've been on my phone too long.

Think about that. The phone is concerned about me. The phone has drawn a boundary.

I've been in relationships that never reached that level of self-awareness.

"Screen time is up thirty percent this week." Okay but in my defense, I was reading. "You were watching a man fall off a skateboard for two hours." ...It was the same man. I got invested.

I tried meditation to deal with the stress. Downloaded a meditation app — because apparently the cure for too much phone is more phone.

The app has this calming voice that says, "Clear your mind of all thoughts."

And my brain goes, "Okay. No thoughts."

Two seconds later: "Did I leave the stove on? What IS a mortgage, really? Why do they call it a building if it's already built?"

The app says, "You're doing great."

It's lying. The app is lying to me and somehow that's the most comforting thing anyone has said to me all week.

I think that's the thing about modern life. We've built all this technology to make us feel better, and mostly what it does is hold up a very precise, data-driven mirror and go, "Look at yourself."

And we look.

And we go back to the chips.

Thank you, you've been wonderful.

[approximately 3 minutes at conversational pace]

DeepSeek V3.2

DeepSeek · Dec 2025 · text

No content available for this output.

8 / 47

Realistic AI Interview · conversation

Tests an AI's ability to simulate personalities and predict future trends

Claude Sonnet 4.6

Anthropic · Feb 2026 · text

No content available for this output.

DeepSeek V3.2

DeepSeek · Dec 2025 · text

No content available for this output.

9 / 47

Satirical Fake News Headline · conversation

Tests an AI's humor and understanding of current events

Claude Sonnet 4.6

Anthropic · Feb 2026 · text

No content available for this output.

DeepSeek V3.2

DeepSeek · Dec 2025 · text

No content available for this output.

10 / 47

Character Voice Test · conversation

Tests an AI's ability to write in distinct character voices

Claude Sonnet 4.6

Anthropic · Feb 2026 · text

No content available for this output.

DeepSeek V3.2

DeepSeek · Dec 2025 · text

No content available for this output.

11 / 47

Minimalist Landing Page · web design

Tests an AI's ability to generate a complete, working landing page

Claude Sonnet 4.6

Anthropic · Feb 2026 · website

DeepSeek V3.2

DeepSeek · Dec 2025 · website

12 / 47

Pokémon Battle UI Recreation · web design

Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.

Claude Sonnet 4.6

Anthropic · Feb 2026 · website

No content available for this output.

DeepSeek V3.2

DeepSeek · Dec 2025 · website

No content available for this output.


