I used to daydream about having a web proxy that could store every page I visited (instead of having to manually save interesting pages... something I do a lot). But I never had the storage space for that, and the bloat in web pages has grown faster than the size of disk I can afford. Since I started using Offpunk for Gemini some time ago I at least get a complete saved record of all the Gemini pages I read. 65 MB in one year. Far more realistic to maintain than with web pages.
> But I never had the storage space for that, and the bloat in web pages has grown faster than the size of disk I can afford.
Is this really true for you? It seems surprising to me given how cheap large HDD storage is now. I have trouble believing even bloated web pages are that large, relatively speaking. I guess I should try it and find out; probably I’m wrong and it’s much more data than I’m expecting.
Gemini seems really neat, I should have investigated earlier.
ssh kiosk@gemini.circumlunar.space
Lets one get a sense of what it is about. 1 bookmarks found me some spaces.
Two years of using Offpunk almost exclusively, without ever trimming the cache:
2.5G gemini
1.3G gopher
1.5G http
23G https
With the exception of dynamic webpages, every single page I’ve read in those two years is there. With pictures. And with every single page linked by those webpages. And all the pages linked on HN.
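For anyone who wants to check their own numbers, here is a minimal sketch that totals file sizes per top-level directory of a cache. The cache path (`~/.cache/offpunk`) and the function name `dir_sizes` are my assumptions, not something confirmed above, so adjust them to your setup. It sums apparent file sizes, so the figures can differ a little from what `du` reports.

```python
import os
from pathlib import Path


def dir_sizes(root):
    """Return {subdirectory name: total bytes of the files beneath it}."""
    sizes = {}
    for entry in Path(root).iterdir():
        if not entry.is_dir():
            continue  # only top-level directories (e.g. gemini, https) are counted
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):  # skip symlinks to avoid double counting
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes


if __name__ == "__main__":
    # Assumed cache location; change this if your cache lives elsewhere.
    cache = Path.home() / ".cache" / "offpunk"
    if cache.is_dir():
        for name, size in sorted(dir_sizes(cache).items()):
            print(f"{size / 1e9:6.2f} GB  {name}")
```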
That’s very awesome and by itself a compelling argument for one to consider adopting offpunk.
I’d imagine putting the archive in a git repo with annex/LFS, along with the built-in timestamps, would make it an activity archive as well. Lots of interesting use cases if combined with LLMs and RAG, for example.
“A few weeks ago I was researching technology XYZ and there was an open source python package that looked neat, but I can’t remember what it was. Could you use my browsing cache and prepare a report on what I was reading about and summarize it please?”
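The retrieval half of that could start as something very simple, before any LLM is involved. Below is a rough sketch that lists cached files fetched in a given time window that mention a keyword. It assumes each cached file's modification time reflects when the page was fetched, which I have not verified for Offpunk, and `recent_matches` is a hypothetical helper name.

```python
import os
from pathlib import Path


def recent_matches(root, keyword, since, until):
    """Return sorted (mtime, path) pairs for files under root that were
    modified between since and until (Unix timestamps) and contain keyword
    (case-insensitive for ASCII text)."""
    needle = keyword.lower().encode()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                mtime = path.stat().st_mtime
                if not (since <= mtime <= until):
                    continue  # outside the time window of interest
                if needle in path.read_bytes().lower():
                    hits.append((mtime, str(path)))
            except OSError:
                continue  # unreadable or vanished file: skip it
    return sorted(hits)
```

The resulting list of paths is what you would then hand to a summarizer or a RAG pipeline. Reading every file whole is fine for a cache of a few tens of GB, but an index would be the obvious next step.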