Hacker News
behnamoh
5 months ago
on:
DeepSeek-R1
No, Sonnet 3.5 is #7 on LiveBench, even below DeepSeek V3.
thegeomaster
5 months ago
The parent comment was talking about coding specifically, not the average score. I see o1 at 69.69, and Claude 3.5 Sonnet at 67.13.
sebastiennight
5 months ago
o1's score looks like exactly what I would expect Elon Musk to aim for with Grok's benchmarks