What is the difference between Claude Sonnet 3.6 (2022-10-22) and Claude Opus 4.6?

Both are developed by Anthropic but target different use cases. Claude Sonnet 3.6 (2022-10-22) has a 200K token context window vs Claude Opus 4.6's 1.0M. You can compare their actual outputs across 42 challenges on RIVAL to see how they differ in practice.

Which is better, Claude Sonnet 3.6 (2022-10-22) or Claude Opus 4.6?

It depends on your use case. Claude Sonnet 3.6 (2022-10-22) and Claude Opus 4.6 each have strengths in different areas. RIVAL lets you compare their real outputs side-by-side across 42 challenges so you can judge which fits your needs best.

How much does Claude Sonnet 3.6 (2022-10-22) cost compared to Claude Opus 4.6?

Claude Sonnet 3.6 (2022-10-22) costs $3/M input tokens and Claude Opus 4.6 costs $5/M input tokens. Claude Sonnet 3.6 (2022-10-22) is $2.00/M cheaper per input. Check their side-by-side outputs on RIVAL to see if the price difference is justified by quality.

How can I compare Claude Sonnet 3.6 (2022-10-22) and Claude Opus 4.6 on RIVAL?

This page shows a side-by-side comparison of Claude Sonnet 3.6 (2022-10-22) and Claude Opus 4.6 across shared challenges. You can vote on which model produced the better output, and Pro users can create custom challenges to test both models with their own prompts.

Rival

Claude Sonnet 3.6 (2022-10-22)vsClaude Opus 4.6

37 challenges

Why Claude Sonnet 3.6 (2022-10-22)?

1.7x cheaper overall$3.00/M in · $15.00/M out

Why Claude Opus 4.6?

5x more context1.0M

Leads 1 of 1 benchmarks

Stronger on SWE-bench Verified80.8% vs 49.0%

20 months newerFeb 2026

Favorites

Claude Sonnet 3.6 (2022-10-22)Claude Opus 4.6

Movie

The Princess Bride

19877.7

2001: A Space Odyssey

19688.1

Album

Kind of Blue

Miles Davis

Kind of Blue

Miles Davis

Book

Братья Карамазовы

Фёдор Михайлович Достоевский

Gödel, Escher, Bach

Douglas R. Hofstadter

City

Tokyo

Japan

Istanbul

Turkey

Game

Portal

Action, Puzzle4.5

The Legend of Zelda: Breath of the Wild

Adventure, Action4.5

12 / 37

Estimate Complexityreasoning

Tests an AI's ability to make educated estimates based on technical knowledge

Claude Sonnet 3.6 (2022-10-22)

Anthropic · Jun 2024text

No content available for this output.

Claude Opus 4.6

Anthropic · Feb 2026text

Estimating FLOPs to Train GPT-3

Key Parameters

Let me recall the key specs of GPT-3:

Parameters: 175 billion (1.75 × 10¹¹)
Training tokens: ~300 billion (3 × 10¹¹)
Architecture: Decoder-only Transformer

The Standard Approximation Rule

There's a well-known approximation for Transformer training FLOPs:

$$C ≈ 6 \times N \times D$$

where:

N = number of parameters
D = number of training tokens
6 comes from the factor of 2 for each of three operations:
- Forward pass: ~2 FLOPs per parameter per token (one multiply + one add per parameter for matrix operations)
- Backward pass (activation gradients): ~2 FLOPs per parameter per token
- Backward pass (weight gradients): ~2 FLOPs per parameter per token

Why 2 FLOPs per parameter per token for the forward pass?

A Transformer is dominated by matrix multiplications. For a matrix multiplication of shape (m × k) times (k × n), the FLOPs are ~2mkn (k multiplies and k-1 adds per output element ≈ 2k ops × mn outputs). When you sum across all the weight matrices in the model, the total forward-pass FLOPs work out to approximately 2 × N per token, where N is the total parameter count.

Calculation

$$C ≈ 6 \times N \times D$$

$$C ≈ 6 \times (1.75 \times 10^{11}) \times (3 \times 10^{11})$$

$$C ≈ 6 \times 5.25 \times 10^{22}$$

$$C ≈ 3.15 \times 10^{23} \text{ FLOPs}$$

Cross-check with reported values

The OpenAI paper actually reported 3.14 × 10²³ FLOPs, so this estimate is essentially exact — which makes sense because the 6ND formula is precisely the approximation the community (and OpenAI themselves) use.

Final Answer

$$\boxed{C \approx 3.15 \times 10^{23} \text{ FLOPs}}$$

This is about 315 ZettaFLOPs — roughly equivalent to running thousands of V100 GPUs for weeks-to-months, which aligns with what we know about the training infrastructure used.

Vote

2 / 37

AI Board Game Logicreasoning

Tests an AI's ability to understand game rules and strategy

Claude Sonnet 3.6 (2022-10-22)

Anthropic · Jun 2024svg

No content available for this output.

Anthropic · Feb 2026text

No content available for this output.

Vote

11 / 37

Minimalist Landing Pageweb design

Tests an AI's ability to generate a complete, working landing page

Claude Sonnet 3.6 (2022-10-22)

Anthropic · Jun 2024website

Claude Opus 4.6

Anthropic · Feb 2026website

Vote

12 / 37

Pokémon Battle UI Recreationweb design

Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.

Claude Sonnet 3.6 (2022-10-22)

Anthropic · Jun 2024website

No content available for this output.

Claude Opus 4.6

Anthropic · Feb 2026website

No content available for this output.

Vote

12 of 37

FAQ

Continue exploring

Claude Sonnet 3.6 (2022-10-22) vs Llama 4 MaverickNew provider

Claude Sonnet 3.6 (2022-10-22) vs GPT-5New provider

Claude Sonnet 3.6 (2022-10-22) vs MiniMax M2.5New provider

Estimating FLOPs to Train GPT-3

Key Parameters

Let me recall the key specs of GPT-3:

Parameters: 175 billion (1.75 × 10¹¹)
Training tokens: ~300 billion (3 × 10¹¹)
Architecture: Decoder-only Transformer

The Standard Approximation Rule

There's a well-known approximation for Transformer training FLOPs:

$$C ≈ 6 \times N \times D$$

where:

N = number of parameters
D = number of training tokens
6 comes from the factor of 2 for each of three operations:
- Forward pass: ~2 FLOPs per parameter per token (one multiply + one add per parameter for matrix operations)
- Backward pass (activation gradients): ~2 FLOPs per parameter per token
- Backward pass (weight gradients): ~2 FLOPs per parameter per token

Why 2 FLOPs per parameter per token for the forward pass?

Calculation

$$C ≈ 6 \times N \times D$$

$$C ≈ 6 \times (1.75 \times 10^{11}) \times (3 \times 10^{11})$$

$$C ≈ 6 \times 5.25 \times 10^{22}$$

$$C ≈ 3.15 \times 10^{23} \text{ FLOPs}$$

Cross-check with reported values

Final Answer

$$\boxed{C \approx 3.15 \times 10^{23} \text{ FLOPs}}$$

This is about 315 ZettaFLOPs — roughly equivalent to running thousands of V100 GPUs for weeks-to-months, which aligns with what we know about the training infrastructure used.