Sample size of 1 but GPT-5 seems horrendous at coding?
My go-to benchmark is a 3D snake game that Claude does almost flawlessly (or at least in 3-4 iterations).
The prompt:
write a 3d snake game in js and html. you can use any libraries you want. the game still happens inside a single plane, left arrow turns the snake left, right arrow turns it right. the plane is black and there's a green grid. there are multiple rewards of random colors at a given time. each time a reward is eaten, it becomes the snake's new head. The camera follows the snake's head, it is above an a bit behind it, looking forward. When the snake moves right or left, the camera follows gradually left or right, no snap movements. write everything in a single html file.
EDIT: I'm not trying to shit on GPT-5. So many people here seem to be getting very good results; am I doing something wrong with my prompt?
Thanks, the issue was indeed that I wasn't explicitly using the thinking model (or they changed something over the weekend). It's at least on par with Claude now.
EDIT: clearly better than Claude or any other model that I tried before.
I had a bonus benchmark: add a narrow triangle on the head of the snake that indicates the direction of movement. GPT-5 fixed it after a single iteration, whereas Claude could never get the rotation of the triangle right, nor could o3 the last time I tried.
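For anyone wondering why the triangle rotation and the "no snap" camera trip models up, here's a rough sketch of the math as I understand it. This is not code from any model's output. All the function names (`turnLeft`, `turnRight`, `headingToYaw`, `shortestAngleDelta`, `smoothYaw`) are made up for illustration, and I'm assuming a Y-up, right-handed scene like three.js uses, with the triangle's tip pointing along its local +z axis and a smoothing `rate` picked arbitrarily.

```javascript
// Heading is a unit vector on the ground plane: { x, z }. Y is up.
// Assumption: right-handed, Y-up coordinates (the three.js convention).

// Turn 90 degrees to the right of the current heading (right = forward x up).
function turnRight(h) {
  return { x: -h.z, z: h.x };
}

// Turn 90 degrees to the left of the current heading.
function turnLeft(h) {
  return { x: h.z, z: -h.x };
}

// Yaw (rotation about Y) that makes a mesh whose tip points along local +z
// point along the heading. Note the argument order: atan2(x, z), not (z, x).
function headingToYaw(h) {
  return Math.atan2(h.x, h.z);
}

// Signed smallest rotation from one angle to another, in [-PI, PI].
// Without this, the camera can spin the long way around after a turn.
function shortestAngleDelta(from, to) {
  const twoPi = Math.PI * 2;
  let d = (to - from) % twoPi;
  if (d > Math.PI) d -= twoPi;
  if (d < -Math.PI) d += twoPi;
  return d;
}

// Frame-rate independent easing of the camera yaw toward the target yaw.
// rate is an arbitrary tuning constant (higher = snappier follow).
function smoothYaw(current, target, dt, rate = 5) {
  const t = 1 - Math.exp(-rate * dt);
  return current + shortestAngleDelta(current, target) * t;
}
```

In a render loop you would presumably set the triangle's Y rotation to `headingToYaw(heading)` right away, and feed the same value as `target` into `smoothYaw` each frame to place the camera behind the head. Swapping the `atan2` arguments or skipping the shortest-angle step are two plausible ways to end up with a sideways triangle or a spinning camera.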