
My thought too: honoring robots.txt is just a convention. There's no requirement to follow it, or at least certainly no technical requirement. I don't think there's any automatic legal requirement either.
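
To illustrate the "no technical requirement" point, here's a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` user agent, the example.com URLs, and the `is_allowed` helper are all made up for illustration. The check only happens because the crawler chooses to run it; nothing on the server side enforces the answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption, not from any real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/public/page",
                "https://example.com/private/page"):
        # A polite crawler would skip URLs where this is False.
        # A crawler that never calls is_allowed() can fetch them anyway.
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

A crawler that skips this check entirely still gets the page, which is why it's a convention rather than an access control.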

Maybe sites could add "you must honor policies set in robots.txt" to something like a terms of service, but I have no idea if that would have enough teeth to make a crawler give up.






I don't think terms of service are applicable anyway. Terms of service aren't a signed contract, since you may never see them or even know they exist. That's true whether you visit the site interactively or fetch a page programmatically.

Cloudflare and their customers have been desperately trying for years to kill scrapers in court. This is all meaningless, but they are probably gearing up for another legal battle to define robots.txt as a legal contract. They're going to use this marketplace they're scamming people with to do it. They will fail.


