Hacker News
One million (small web) screenshots (nry.me)
162 points by squidhunter 1 day ago | 22 comments




I found my own blog [0]. Interestingly, the letter I is missing from its screenshots from July 2025 onward.

[0]: https://onemillionscreenshots.com/idiallo.com/screenshot


If it was unclear, OP's site is screenshots.nry.me, not onemillionscreenshots.com

There are many patches of almost-identical sites.

Some of them are due to many people using the same theme.

Some of them are expired or parked domains, which I reckon should be detected and excluded.


Yeah, I wonder why parked domains are included. Are there not at least 1 million actual websites?

Yeah, those clusters are interesting. They stand out, so they were the first thing I zoomed in on; then I realized they're all just stock resume sites. You quickly realize the clusters are something to avoid. Turns out to be an effective visualization method.

The thing I find interesting is where the grouping is robust to colour variations: one of the bigger groups is around 25% from left, 20% from bottom, all one theme but in a wide variety of colours.

>Some of them are due to many people using the same theme.

Teeming masses of sites using what probably seems to the authors like a fresh, unconventional look but ends up being Yet Another.


I doubt anyone selecting a popular theme is confused by the fact that it’s popular. I use the default Mediawiki theme for mine, for instance.

I found my own site, as well, and I found that particularly charming.

Recently went back [0] to the open web and feel like this inclusion alone justified that move.

Thanks for sharing. Humble and heart-warming way to end 2025 for an old Internet man.

[0]: https://frankycaron.medium.com/of-an-open-web-rebirth-and-bi...


I’m curious how the choice of which blog is located next to which was made. The writeup mentions “dimensionality”. I found my blog, and the eight surrounding it are interesting people, but every one of them is an AI researcher with degrees from Berkeley or similar, and the sites are predominantly CVs.

Luminous company but not my level, nor is my blog about AI, nor is it a CV. I can’t see any reason for the location.


I think it is literally by the colors of the screenshots. Nothing to do with the contents.

> I just want to encode the high level aesthetic details of webpage screenshots. Because of this, I fell back on an old friend: the triplet loss on top of a small encoder. The resulting output dimension of 64 afforded ample room for describing the visual range while maintaining a considerably smaller footprint.
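For anyone unfamiliar with the term in that quote, here is a minimal sketch of the standard margin-based triplet loss on 64-dimensional embeddings. It is not the author's code: the Euclidean distance, the margin of 1.0, and the made-up vectors are all assumptions for illustration, and the encoder that would produce the embeddings is left out entirely.

```python
import math
import random

DIM = 64  # output dimension mentioned in the quoted writeup


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (an assumed value here).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical embeddings standing in for encoder outputs.
rng = random.Random(0)
anchor = [rng.gauss(0, 1) for _ in range(DIM)]
positive = [x + 0.01 for x in anchor]  # visually similar page: very close
negative = [x + 1.0 for x in anchor]   # visually different page: far away

# Well-separated triplet: nothing left to learn, loss is zero.
easy = triplet_loss(anchor, positive, negative)

# Swapped roles: the "positive" is far and the "negative" is near,
# so the loss is large and training would pull the embeddings apart.
hard = triplet_loss(anchor, negative, positive)

print(easy, hard)
```

Training an encoder against a loss like this pulls screenshots labelled as similar together and pushes dissimilar ones apart, which would be consistent with sites landing next to visually similar neighbours regardless of their content.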


So good to see this different approach! The clustering looks really cool and love that the focus is not on the most popular websites.

Here’s another Christmassy alternative: https://display.archive.org/xmas

I’m one of the makers of OneMillionScreenshots.com and I’m currently working on an update to it.


I started by finding my own blog and scrolling north, south, east, and west to see my neighbours. I’ve already found several interesting sites and a new person to follow on mastodon.

It’s a shame there doesn’t seem to be any way to link to a particular position on the map but great stuff nevertheless.


Oh wow, my little profile/blog is on here, nice!

That's a lot of fun to explore. I'm not entirely convinced by the "you can judge a book by its cover" thing, there are so many "Hi, I'm _____" pages that might have content or might just be portfolio stubs.

Maybe you could add a timeline and a clock:

Timeline: view older versions

Clock: view light/dark mode theme according to user time zone (or enable dark/light mode manually)

I'm also a bit curious, since most web pages are predominantly white, how many of them are adapted to dark mode?


i didn't know about onemillionscreenshots before but..

this is one of the coolest blogs i have ever read!


Very surprised to see my website on there. But I'm assuming it's >6 months old because I went batshit crazy on the UI recently.

Fyi the link in your HN profile 404s (but the website looks nice, good work!)

Thank you so much!

Nice catch, fixing :D


Shit, my blog is on there. I should post on it more frequently than once every two years.

My forum isn't, though. With a post every day or so and nearly 50 active users, it's probably not "small web" any more :-D


The long tail of the internet is truly long.

My website [1] gets perhaps as many as 200 visitors a week according to Cloudflare. And it's still there at number 399322 (first half of the pack).

[1] https://onemillionscreenshots.com/dmitriid.com



