> and you then visited a second website, and it also included the same resource, then the resource would be loaded from the shared cache rather than being downloaded from the internet a second time. Cookies set by these resources would also be shared.
The privacy problem is not a result of the shared dependency, it's a result of the shared cookies.
Yes, if you share the execution space between multiple programs running the same lib, there's a privacy concern.
> The privacy problem is not a result of the shared dependency, it's a result of the shared cookies.
No, it's not the result of the shared cookies. You just ignored all of the timing attacks and fingerprinting which a shared cache allows, as those articles discuss.
The cookie thing is honestly completely irrelevant to the topic of privacy, if you understand how the shared cache used to work. If you loaded the same library from separate CDNs on different websites, the shared cache didn't come into play at all. The library was loaded twice anyways. There was no chance for cookies from different CDNs to accidentally cross the streams. Browsers weren't attempting to heuristically determine if you were trying to load the same asset from different hosts.
The shared cache only came into play if you loaded the same asset from the same third-party CDN on multiple websites. The host serving an asset is the one responsible for setting the cookies, and sharing the cookies is helpful in that case, since the same CDN is the host serving the asset to the browser for both websites. These aren't cookies controlled by separate websites; they're cookies supplied by the CDN, and they should be the same for both requests anyway, barring some remote, theoretical attack involving a malicious CDN intentionally setting weird request-dependent cookies, and I can't see how even that would do anything harmful. So the cookies get shared because the browser is serving a cached response for the same asset from the same host to both sites, which makes sense.
So, cookies aren't the problem here, and they're not the reason the shared cache was partitioned. If you want browsers to undo that in any form, you would have to solve the actual privacy problems here.
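To make the old behavior concrete, here's a rough sketch of how the unpartitioned cache was keyed. This is illustrative pseudocode only, not any browser's actual implementation:

    // Illustrative only: the pre-partitioning HTTP cache was keyed by the
    // resource URL alone, regardless of which website triggered the request.
    const sharedCache = new Map(); // url -> cached response

    function lookup(url) {
      // siteA.example and siteB.example both loading
      // https://cdn.example.com/lib.min.js hit the same entry...
      return sharedCache.get(url);
    }

    // ...while the "same" library served from a different CDN is a different
    // URL, so it's a separate entry and gets downloaded again anyway.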
Fingerprinting is possible if you know what's in the cache before you use it.
If I make a request for a given dependency, and am allowed to so much as time how long it takes to resolve, I can detect if it was already there and there's an information leak.
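For example, a probe like this is enough to leak that bit. A minimal sketch, assuming the old unpartitioned cache; the URL and the latency threshold are made up:

    // Time a fetch of a popular CDN asset. With an unpartitioned cache, a
    // near-instant response suggests some other site already pulled it into
    // the cache. The 25ms cutoff is an invented number for illustration.
    async function probablyCached(url) {
      const start = performance.now();
      await fetch(url, { mode: 'no-cors' }); // timing only, body not needed
      return (performance.now() - start) < 25;
    }

    probablyCached('https://cdn.example.com/lib/1.2.3/lib.min.js')
      .then(hit => console.log(hit ? 'likely a cache hit' : 'likely went to the network'));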
Sure. At some point though, a malicious site's gonna end up making some very weird requests - obviously polling the cache.
You could specify the dependency set in a static context and limit the ability of a site to measure how long dependency resolution takes.
Is there still an information leak? Yes.
Do I think it should stand in the way of a functional internet? Not really.
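Something along those lines could look like a page-level declaration the browser resolves up front, before any script gets to take timings. Purely a hypothetical sketch, not an existing API; the names and shape are invented:

    // Hypothetical: dependencies declared statically, resolved by the browser
    // before any script runs, so scripts never observe per-dependency timing.
    const dependencyManifest = {
      "dayjs@1.11.10":    { url: "https://cdn.example.com/dayjs.min.js",     sha384: "..." },
      "bootstrap@5.3.3":  { url: "https://cdn.example.com/bootstrap.min.css", sha384: "..." },
    };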
Google, Apple, and Mozilla apparently all failed to find a solution to this problem, even though lots of people wanted there to be a solution. Certainly, any browser that shipped this feature in a privacy-respecting manner would have bragging rights for a while, so there is an incentive. Given that, I don't think I'm exaggerating the difficulty of the problem. This problem is also very similar to the challenges brought by Spectre-class vulnerabilities, and those have been enormously costly for the whole industry.
I think it is completely fair to say that privacy-respecting shared caches are not simple.
Some solutions can be imagined, but they come with weird trade-offs or they do nothing for the majority of the web. A new manifest format like you describe falls into the latter category, since it would only apply to new websites using the new feature, and that's without digging into the other problems it would pose.
In practice, people often visit the same websites repeatedly; they aren't constantly visiting new websites only once. A partitioned cache works just fine for that normal scenario. It's slightly less efficient for the first day someone uses their browser, but things are honestly fine after that. It's unfortunate that we can't eke out that last tiny bit of performance, but I think the difference would be hard to measure in practice.
In my opinion, if websites used brotli more commonly, that would make a far larger difference in efficiency than returning to a shared cache, and if browsers had a standardized means of downloading only the bytes that changed in an asset like a JavaScript library, instead of downloading the new version from scratch, that would make a much bigger difference too.
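As a quick way to see what's on the table, Node's built-in zlib can compare gzip and brotli output for any asset you have locally. A small sketch; the file path is a placeholder:

    // Compare gzip vs. brotli sizes for a local asset.
    // Requires Node.js 12+ (built-in brotli support in zlib).
    const zlib = require('zlib');
    const fs = require('fs');

    const src = fs.readFileSync('./lib.min.js'); // placeholder: any JS/CSS asset
    const gz = zlib.gzipSync(src, { level: 9 });
    const br = zlib.brotliCompressSync(src);

    console.log(`original: ${src.length} bytes`);
    console.log(`gzip -9:  ${gz.length} bytes`);
    console.log(`brotli:   ${br.length} bytes`);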
> I think it is completely fair to say that privacy-respecting shared caches are not simple.
Couldn’t they make an exception for some domains and create a registry of links to really popular or fundamental packages like jQuery et al.? I have read on this topic before, but it sounded like all-or-nothing, no-shades-of-grey maximalism. Fine, partition those memes from imgur CDNs, but at least let common libraries with known hashes be shared. The potential attack is based on leaving a CDN-hosted pixel and timing its download on other sites. But there is no big data in who has the 10 most popular releases of wasm-sqlite, dayjs or bootstrap.min.css in their cache. These could be warmed up from literally anywhere, or even synced in the background by an idle browser thread.
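For what it's worth, the "known hashes" part already has a standard shape: subresource-integrity-style digests, which you can compute in the browser. A small sketch; the CDN URL is a placeholder and the CDN has to allow CORS for the bytes to be readable:

    // Compute an SRI-style sha384 hash for a CDN-hosted library.
    async function sriHash(url) {
      const buf = await (await fetch(url)).arrayBuffer();
      const digest = await crypto.subtle.digest('SHA-384', buf); // 48 bytes
      const b64 = btoa(String.fromCharCode(...new Uint8Array(digest)));
      return `sha384-${b64}`;
    }

    sriHash('https://cdn.example.com/bootstrap.min.css').then(console.log);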
I feel like Google Chrome shipped an experiment at one point that was going to include some of the most popular libraries with the browser, so they would be equally cached for all Chrome users, for all sites. I'm having trouble finding any announcements about this, so maybe I dreamed this up.
Sharing a cache between websites has proven to be a privacy issue.
Read more here: https://developer.chrome.com/en/blog/http-cache-partitioning...
or here: https://www.peakhour.io/blog/cache-partitioning-firefox-chro...
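The short version of what those articles describe: the cache key gained the top-level site, so the same URL cached under one site is invisible to another. Roughly, as illustrative pseudocode (Chrome's real key also includes the framing site):

    // Partitioned cache: the key now includes the top-level site, so the same
    // CDN URL loaded on two different sites is two separate entries.
    const partitionedCache = new Map(); // `${topLevelSite} ${url}` -> response

    function lookup(topLevelSite, url) {
      return partitionedCache.get(`${topLevelSite} ${url}`);
    }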