
There are really two problems, as I understand it:

- Overcommit. Linux will "overcommit" memory: allocations succeed even when there isn't enough memory to back them, and a physical page is only assigned when the page is first touched. If no physical page is available at that point, the process stalls (to my understanding). Windows NT doesn't do this. Not sure exactly how macOS/XNU handles it.

- The OOM killer. Because allocations don't fail, to actually recover from an OOM situation the kernel will enumerate processes and try to kill ones that are using a lot of memory, by scoring them using heuristics. The big problem? If there isn't a single process hogging the memory, this approach is likely to work very poorly. As an example, consider a highly parallel task like make -j32. An individual C++ compiler invocation is unlikely to use more than a gigabyte or two of memory, so it's more likely that things like Electron apps will get caught first. The thrashing of memory combined with the high CPU consumption of compilers that are not getting killed will grind the machine to a near-complete halt. If you are lucky, then it will finally pick a compiler to kill, and set off a chain reaction that ends your make invocation.

There are solutions. You can impose quotas with cgroups, and tools like systemd-oomd try to provide better userspace OOM killing on top of cgroups. You can disable overcommit, but some software will not function very well like this, since it likes to allocate a ton of address space ahead of time and potentially use it later. Overcommit fundamentally improves the ability to efficiently utilize all available memory. Ultimately I think overcommit is probably a bad idea... but it is hard to come up with a zero-compromise solution that keeps optimal memory/CPU utilization while avoiding pathological OOM conditions by design.



> two problems ... overcommit

Is there any other sensible way to do this, though? It would be quite inefficient to constantly call mmap for additional small(ish) pieces of memory. In effect, overcommit just means that until a page is actually written to, it hasn't really been allocated. (Aside: I believe a malloc implementation that zeroed out blocks on allocation would fail up front at allocation time rather than later, in case that happens to be what bugs you about it.)

Additionally, how do you suppose fork should be implemented efficiently? Currently it performs copy-on-write. At minimum you'd need a way to mark pages as "never going to write to these, don't reserve space for a copy." Except such an API is either very awkward to use in practice or else leaves you with some very awkward edge cases to deal with in your program logic.

> You can disable overcommit, but some software will not function very well

Yeah about that.

Chromium runs (AFAIK) one PID namespace per tab. On my machine right now it reports 1.1 TiB of virtual memory with a little over 100 MiB resident per tab. 1.1 TiB mapped PER TAB. Of the resident memory I have no idea how much is actually unique (i.e. written to after the initial fork).

Firefox is much more reasonable at a mere 18 GiB mapped per PID.


> Chromium runs (AFAIK) 1 PID namespace per tab. On my machine right now it reports 1.1 TiB virtual memory with a little over 100 MiB resident per tab. 1.1 TiB mapped PER TAB. Of the resident I have no idea how much is actually unique (ie written to following the initial fork).

This is most likely an address-space reservation trick for garbage collection, memory-bug hardening, or both. Haskell programs also map 1 TB.


A potential workaround would be to still allow giant mmaps but not hang a program when it runs out of pages and instead send a signal to it. Obviously, neither Chrome nor Firefox actually use this much memory in practice.


Rather than a workaround I think that would just be an overall better approach. Receive an actionable error when the allocation happens "for real", whether that's at an arbitrary point in user code or when malloc zeros out the block ahead of time.

However, I think you'd need per-thread signal handlers for that to work sensibly. The kernel supports this (see man 2 clone), but it would require updates to (at least) POSIX and glibc.

It would probably also be nice to have a way to allocate pages without writing to them. Currently we have mlock, but that prevents swapping, which isn't desirable in this context.




