I think it's 100% ok to freely train on public internet data.

What is absolutely not ok is crawling at such an excessive speed that it becomes difficult to host small-scale websites.

Truly a tragedy of the commons.

Agree. The problem lately is that even if each single scraper is doing so “reasonably,” there are so many individuals and groups doing this that it’s still too onerous for many sites. And of course many are not “reasonable.”

This is the attitude that's going to kill the public internet. Because you're right, it is a free-for-all right now, and the only way to opt out is to put content behind restricted platforms.


