Claude 3.7 Sonnet

Claude 3.7 Sonnet offers extended thinking scaffolds that raise SWE-bench Verified coding accuracy from 62.3% to 70.3%. It reaches 81.2% accuracy on retail automation tasks and outperforms the previous Claude 3.5 Sonnet release of 2024-10-22 (informally called "Claude Sonnet 3.6") by 13.6%.
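A gain quoted as a percentage can mean either percentage points or a relative improvement, and the two differ noticeably. As a minimal sketch using only the SWE-bench figures quoted above (the helper name `gains` is hypothetical and not part of any benchmark tooling), the two readings can be computed as:

```python
def gains(baseline: float, improved: float) -> tuple[float, float]:
    """Return (gain in percentage points, relative gain in percent)."""
    absolute = improved - baseline          # difference in percentage points
    relative = absolute / baseline * 100    # improvement relative to the baseline
    return absolute, relative


# SWE-bench Verified figures from the summary above: 62.3% -> 70.3%
abs_pp, rel_pct = gains(62.3, 70.3)
print(f"{abs_pp:.1f} percentage points, {rel_pct:.1f}% relative")
```

For these inputs the gain is 8.0 percentage points, or roughly 12.8% relative to the 62.3% baseline. The summary does not say which reading applies to the 13.6% figure.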

Conversation, Reasoning, Analysis, Summarization
Provider: Anthropic
Release Date: 2025-02-25
Size: Large
Parameters: Not disclosed

Benchmark Performance

Performance on industry-standard AI benchmarks that measure reasoning, knowledge, and specialized task capabilities.

MMLU: 80.3%
MATH: 82.2%
GPQA Diamond: 68.0%
SWE-Bench Verified: 62.3%
Retail Task Accuracy: 81.2%
Airline Task Accuracy: 58.4%
