behnamoh 5 months ago | on: DeepSeek-R1

No, Sonnet 3.5 is #7 on LiveBench, even below DeepSeek V3.

thegeomaster 5 months ago

The parent comment was talking about coding specifically, not the average score. I see o1 at 69.69, and Claude 3.5 Sonnet at 67.13.

sebastiennight 5 months ago

o1's score looks like exactly what I would expect Elon Musk to aim for with Grok's benchmarks
