Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> AI bots

> You can opt into a managed rule that will block bots that we categorize as artificial intelligence (AI) crawlers (“AI Bots”) from visiting your website. Customers may choose to do this to prevent AI-related usage of their content, such as training large language models (LLM).

> CCBot (Common Crawl)

Common Crawl is not an AI bot:

https://commoncrawl.org






The data it collects is used by AI companies, though.



Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: