
https://docs.n2api.io/

A unified API for online advertising. Think Plaid, but for ads platforms instead of banks.


I wrote my bachelor's thesis about IDE support for lexical effects and handlers: https://se.cs.uni-tuebingen.de/teaching/thesis/2021/11/01/ID...

All of what you state is very doable.


This is silly. Behind an LLM sits a deterministic algorithm. So no, it is not possible without inserting randomness by other means into the algorithm, for example by setting a nonzero sampling temperature.

Why are all these posts and news about LLMs so uninformed? This is human built technology. You can actually read up how these things work. And yet they are treated as if it were an alien species that must be examined by sociological means and methods where it is not necessary. Grinds my gears every time :D


Author here. I know it’s silly. I understand to some extent how they work. I was just doing this for fun. Took about 1hr for everything and it all started when a friend asked me whether we can use them for a coin toss.


Sorry, I did not mean to downtalk the blog post :) I did not mean silly as in stupid. It's rather the title that I think is misleading. Can an LLM do randomness? Well, PRNGs are part of it, so the question boils down to whether PRNGs can do randomness. As mentioned here before, setting the temperature of, say, GPT-2 to zero makes the output deterministic. I was 99% sure that you as the author knew about this :)


The algorithms are not deterministic: they output a probability distribution over next tokens, which is then sampled. That’s why clicking “retry” gives you a different answer. An LM could easily (in principle) compute a 50/50 distribution when asked to flip a coin.


They are still deterministic. You can set the temperature to zero to get consistent output, but even nonzero-temperature sampling usually uses a seeded pseudo-random number generator. Though this depends on the implementation.

https://github.com/huggingface/transformers/blob/d538293f62f...
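The point above can be sketched in a few lines. This is a toy sampler, not the transformers implementation: at temperature zero it collapses to greedy argmax (fully deterministic), and at nonzero temperature the only randomness comes from the seeded generator.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, seed=None):
    """Pick a token id from raw logits.

    temperature == 0 collapses to greedy argmax (deterministic);
    otherwise we sample from the softmax, and fixing `seed` makes
    the draw reproducible.
    """
    logits = np.asarray(logits, dtype=np.float64)
    if temperature == 0:
        return int(np.argmax(logits))
    scaled = logits / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    rng = np.random.default_rng(seed)
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 2.0, -5.0]  # e.g. tokens "heads", "tails", other
greedy = sample_next_token(logits, temperature=0)        # always index 0
seeded = sample_next_token(logits, temperature=1.0, seed=42)
```

With the same logits, temperature, and seed, the seeded call returns the same token every time; change the seed and the draw can change.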


As someone who tried really hard to get deterministic outcomes out of them: they really are not.

Layers can be computed in slightly different orders (due to parallelism), on different GPU models, and this will cause small numerical differences which will compound due to auto-regression.


Could someone enlighten me on how to compute layers in parallel? I was under the impression that the linearity of the layer computation was why we are mostly bandwidth constrained. If you can compute the layers in parallel, then why do we need high bandwidth?



All things being equal, if you fix all of those things and the hardware isn't buggy, you get the same results, and I've set up CI with golden values that requires this to be true. Indeed, occasionally you have to change golden values depending on the implementation, but mathematically the algorithm is deterministic, even if in practice determinism requires a bit more effort.


But the reality is that all things aren’t equal and you can’t fix all of those things, not in a way that is practical. You’d have to run everything serially (or at least in a way you can guarantee identical order) and likely emulated so you can guarantee identical precision and operations. You’ll be waiting a long time for results.

Sure, it’s theoretically deterministic, but so are many natural processes like air pressure, or the three body problem, or nuclear decay, if only we had all the inputs and fixed all the variables, but the reality is that we can’t and it’s not particularly useful to say that well if we could it’d be deterministic.


It's definitely reachable in practice. Gemini 2.0 Flash is 100% deterministic at temperature 0, for example. I guess it's due to the TPU hardware (but then why aren't other Gemini models like that...).


Anyways, this is all immaterial to the original question, which is whether LLMs can do randomness [for a single user with a given query]. So from a practical standpoint the question itself needs to survive "all things being equal", that is to say: suppose I stand up an LLM on my own GPU rig, and the algorithmic scheduler doesn't do too many out-of-order operations (very possible depending on the ollama or vllm build).


Setting the temperature to zero reduces the process to greedy search, which does a lot more things to the output than just making it non-random.


Yes, so it's basically asking whether that probability distribution is 50/50 or not. And it turns out that it's sometimes very skewed, which is a non-obvious result.


So, what ‘algorithms’ are you talking about? The randomness comes from the input value (the random seed). Once you give it a random seed, a pseudo-random number generator (PRNG) makes a sequence from that seed. When the LLM needs to ‘flip a coin,’ it just consumes a value from the PRNG’s output sequence.

Think of each new ‘interaction’ with the LLM as having two things that can change: the context and the PRNG state. We can also think of the PRNG state as having two things: the random seed (which makes the output sequence), and the index of the last consumed random value from the PRNG. If the context, random seed, and index are the same, then the LLM will always give the same answer. Just to be clear, the only ‘randomness’ in these state values comes from the random seed itself.

The LLM doesn’t make any randomness, it needs randomness as an input (hyper)parameter.
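A toy model of this framing, where the sampler's state is just (seed, index): replaying the same seed and advancing to the same index reproduces the exact draw, so the "coin flip" repeats too. (`coin_flip` is an illustrative stand-in, not any real LLM sampler.)

```python
import random

def coin_flip(seed, index):
    """Reproduce the draw at position `index` of the stream for `seed`."""
    rng = random.Random(seed)
    for _ in range(index):
        rng.random()  # advance past the already-consumed values
    return "heads" if rng.random() < 0.5 else "tails"

# Same (seed, index) state -> same flip, every time:
first = coin_flip(seed=7, index=3)
second = coin_flip(seed=7, index=3)
```

The only "randomness" in the result is whatever went into choosing the seed.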


The raw output of a transformer model is a list of logits, confidence scores for each token in its vocabulary. It's only deterministic in this sense (same input = same scores). But it can easily assign equal scores to 1 and 0 and zero to other tokens, and you'll have to sample it randomly to produce the result. Whether you consider it external or internal doesn't matter, transformers are inherently probabilistic by design. Randomness is all they produce. And typically they aren't trained with the case of temperature 0 and greedy sampling in mind.
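A quick sketch of the equal-scores case, assuming a tiny four-token vocabulary: the softmax of the logits puts exactly equal mass on the first two tokens, so no deterministic rule can pick between them and the sampler must break the tie.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Equal logits for the "1" and "0" tokens, negligible mass elsewhere:
probs = softmax([5.0, 5.0, -20.0, -20.0])
# probs[0] and probs[1] are each ~0.5 and exactly equal to each other
```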


> But it can easily assign equal scores to 1 and 0 and zero to other tokens, and you’ll have to sample it randomly to produce the result. Whether you consider it external or internal doesn’t matter, transformers are inherently probabilistic by design.

The transformer is operating on the probability functions in a fully deterministic fashion, you might be missing the forest for the trees here. In your hypothetical, the transformer does not have a non-deterministic way of selecting the 1 or 0 token, so it will rely on a noise source which can. It does not produce any randomness at all.


It's one way to look at it, but consider that when 1 and 0 are strictly equal you necessarily need the noise source. You can't tell which one is the answer until you decide randomly.


Right, so the LLM needs some randomness to make that decision. The LLM performs a series of deterministic operations until it needs the randomness to make this decision; there is no randomness within the LLM itself.


But the randomness doesn't directly translate to a random outcome in results. It may randomly choose from a thousand possible choices, where 90% of the choices are some variant of 'the coin comes up heads'.

I think a more useful approach is to give the LLM access to an api that returns a random number, and let it ask for one during response formulation, when needed.


I think GP would consider the sampling bit a part of the API, not a part of the algorithm.


The algorithms are definitely not deterministic. That said I agree with your general point that experimenting on LLMs as if they're black boxes with unknown internals is silly.

EDIT: I'm seeing another poster saying "Deterministic with a random seed?" That's a good point--all the non-determinism comes from the seed, which isn't particularly critical to the algorithm. One could easily make an LLM deterministic by simply always using the same seed.


> all the non-determinism comes from the seed

Not fully true: when using floating point, the order of operations matters, and it can vary slightly due to parallelism. I've seen LLMs return different outputs with the same seed.
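A minimal demonstration of why order matters: floating-point addition is not associative, so a parallel reduction that sums the same values in a different order can give a different result, and over many layers and autoregressive steps those tiny differences compound.

```python
# 1e16 is large enough in float64 that adding 1.0 to it is rounded away.
big = 1e16
vals = [big, 1.0, -big]

left_to_right = (vals[0] + vals[1]) + vals[2]  # the 1.0 is absorbed: 0.0
reordered = (vals[0] + vals[2]) + vals[1]      # the 1.0 survives: 1.0
```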


That’s an interesting observation. Usually we try to control that, but with LLMs the non-determinism is fine.

It seems like that would make it hard to unit test LLM code, but they seem to be managing.


Oh, that's really interesting. Hadn't thought of that.


Deterministic with a random seed?


But then the random seed is the source of randomness and not the training data. So the question "Can LLMs do randomness" would actually boil down to "Can PRNGs do randomness".


"You can actually read up on how these things work."

While you can definitely read about how some parts of a very complex neural network function, it's very challenging to understand the underlying patterns.

That's why even the people who invented components of these networks still invest in areas like mechanistic interpretability, trying to develop a model of how these systems actually operate. See https://www.transformer-circuits.pub/2022/mech-interp-essay (Chris Olah)


Yes, but sometimes asking dumb questions is the first step to asking smart questions. And OP's investigation does raise some questions to me at least.

1. Give a model a context with some # of actually random numbers and then ask it to generate the next random number. How random is that number? Repeat N times, graph the results, is there anything interesting about the results?

2. I remember reading about how brains/etc are kinda edge-balanced chaotic systems. So if a model is bad at outputting random numbers (ie: needs a very high temperature for the experiment from step 1 to produce a good distribution of random numbers) What if anything does that tell us about the model?

3. Can we add a training step/fine-tuning step that makes the model better at the experiment from step #2? What effect does that have on its benchmarks?

I'm not an ML researcher, so maybe this is still nonsense.
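Experiment #1 could be sketched roughly like this. `toy_model` is a hypothetical, deliberately biased stand-in; a real run would replace it with an actual LLM call that receives the context digits and returns its "next random number".

```python
import random
from collections import Counter

def toy_model(context):
    """Hypothetical stand-in for asking a model for the next random digit.

    Deliberately deterministic in its context, like an LLM at temperature 0.
    """
    rng = random.Random(sum(context))
    return rng.randint(0, 9)

random.seed(0)  # reproducible contexts
counts = Counter()
for _ in range(1000):
    context = [random.randint(0, 9) for _ in range(8)]
    counts[toy_model(context)] += 1

# Pearson chi-squared statistic against a uniform distribution over digits;
# values far above ~16.9 (p = 0.05, 9 degrees of freedom) would suggest
# the "RNG" is skewed.
expected = 1000 / 10
chi2 = sum((counts[d] - expected) ** 2 / expected for d in range(10))
```

Graphing `counts` over many trials, or repeating at different temperatures, would be the step 1/step 2 comparison described above.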


Among them were Jewish nationalists like Fritz Haber. He was allegedly a very proud German. Didn't help him, though.


He saw the writing on the wall enough to quit his job in Germany and move his children to the UK.


Also a big fan (and basically the father) of gas warfare


Also saved the world from starvation. And provided Germany with synthetic fuels which prolonged the war. Bit of a mixed bag.

https://www.amazon.de/-/en/Alchemy-Air-Jewish-Scientific-Dis...


So basically a Zettelkasten?


No. It has practically nothing to do with Zettelkasten. For starters, Johnny Decimal says nothing about links.


Uh, you are right, that was a silly comment I made.


I sometimes feel like blogging in the developer world has become something that you copy from the great masters of the craft. Many famous devs blog about the craft; same in the tech entrepreneur bubble. I remember times when blogging was as casual as having an Instagram account. Then there were times when you would blog stuff that could potentially be helpful to others, like a problem with configuring your Linux audio device that you were finally able to fix. For all these topics there are now established communities. The blogosphere is now divided into company blogs (for SEO, maybe also to tell a story or amuse people) and personal blogs. And of these personal blogs, some try to copy their role models in writing and thinking (a good way to become inauthentic and lose touch with yourself), and a minority manages to write whatever comes to their mind, no matter how it reads and how many people will like the idea being conveyed.


> In some sense, AGI is just another tool in this ever-taller scaffolding of human progress we are building together. In another sense, it is the beginning of something for which it’s hard not to say “this time it’s different”; the economic growth in front of us looks astonishing, and we can now imagine a world where we cure all diseases, have much more time to enjoy with our families, and can fully realize our creative potential

Yeah, right. What world is this guy living in? An idealistic one? Will AI spread the profits of that economic growth he is talking about equally? I only see companies getting by on less manpower and doing fine, while poor people stay poor. Bravo. Well thought through, guy who now sells "AI".


Mad props for offering a pump kit!


That! Moving your legs to avoid sitting all day seems fine, but putting yourself in an unergonomic position like on a road bike seems counterproductive. Your neck and lower back will probably tighten up badly over time.


I love the page. I find it very inspiring :)

