Claude 3.7 Thinking Sonnet

Claude 3.7 Thinking Sonnet exposes its full chain-of-thought process during problem-solving, including error backtracking and exploration of alternative solutions. It scores 84.8% on the GPQA Diamond benchmark for expert-level Q&A.

Conversation, Reasoning, Analysis, Summarization
Provider: Anthropic
Release Date: 2025-02-26
Size: LARGE
Parameters: Not disclosed

Benchmark Performance

Performance metrics on industry-standard AI benchmarks that measure capabilities across reasoning, knowledge, and specialized tasks.

MMLU: 77.1%
GPQA Diamond: 84.8%
MATH: 96.2%
AIME: 80.0%
HellaSwag (10-shot): 89.0%
