I think lots of people don't realize that the "non-deterministic" nature of LLMs comes from sampling the token distribution, not from the model itself.
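
A rough sketch of where the randomness enters (using numpy and made-up logits, not any particular model): the forward pass produces the same logits every time; the sampling step on top of them is where the randomness comes in.

    import numpy as np

    def sample_token(logits, temperature, rng):
        # Temperature 0: greedy decoding, always the same token for the same logits.
        if temperature == 0.0:
            return int(np.argmax(logits))
        # Otherwise: temperature-scaled softmax followed by a random draw.
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))

    logits = np.array([2.0, 1.9, 0.5, -1.0])   # stand-in for one model step
    rng = np.random.default_rng(seed=42)

    print(sample_token(logits, 0.0, rng))      # deterministic
    print(sample_token(logits, 1.0, rng))      # depends on the seeded draw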





It's also the way the model runs. Setting the temperature to zero and picking a fixed seed would ideally give deterministic output from the sampler, but with parallel execution of the matrix arithmetic (e.g. on a GPU) the order of floating point operations starts to matter, so timing differences between runs can produce different results.
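
A tiny illustration of that non-associativity (a numpy sketch, no GPU needed): the same values summed in a different order can give slightly different results, which is what happens when a parallel reduction gets scheduled differently between runs.

    import numpy as np

    # Floating point addition is not associative:
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))    # False

    # For a large float32 array, different reduction orders can give
    # slightly different sums -- analogous to a GPU scheduling the
    # partial sums differently from run to run.
    x = np.random.default_rng(0).standard_normal(1_000_000).astype(np.float32)
    print(x.sum(), x[::-1].sum())                    # may differ in the last bits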

Good point, though sampling generally happens on the CPU, sequentially. What you describe might perturb the raw output logits of a single LLM step, but since the differences are tiny, a well-designed sampler could still make the output deterministic (so same seed = same text output). With a very high temperature these small differences might influence the output though, since the ranking of two near-tied tokens might be swapped.
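
A sketch of that edge case (numpy, with made-up numbers; the helper and the drawn value u are hypothetical): a seeded sampler effectively compares one random draw against the cumulative probabilities, so a tiny logit perturbation that moves a boundary past the drawn value can flip the chosen token even with a fixed seed.

    import numpy as np

    def pick_token(logits, temperature, u):
        # Deterministic given u: return the token whose cumulative
        # probability interval contains u.
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        return int(np.searchsorted(np.cumsum(probs), u))

    u = 0.488                                          # the "random" draw, fixed by the seed
    logits_a = np.array([2.0, 2.0, -1.0])              # two nearly tied tokens
    logits_b = logits_a + np.array([1e-3, 0.0, 0.0])   # tiny numerical jitter

    # u was chosen to sit right at the boundary between the two top tokens,
    # so this particular perturbation flips the result; most draws would not.
    print(pick_token(logits_a, 1.0, u))                # token 1
    print(pick_token(logits_b, 1.0, u))                # token 0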

I think the usual misconception is that LLM outputs are random "by default". IMHO this apparent randomness is more of a feature than a bug, but that may be a different conversation.



