
The list of bots is pretty short right now:

https://developers.cloudflare.com/bots/concepts/bot/#ai-bots






> AI bots

> You can opt into a managed rule that will block bots that we categorize as artificial intelligence (AI) crawlers (“AI Bots”) from visiting your website. Customers may choose to do this to prevent AI-related usage of their content, such as training large language models (LLM).

> CCBot (Common Crawl)

Common Crawl is not an AI bot:

https://commoncrawl.org


The data it collects is used by AI companies, though.

Cloudflare sees a lot of the web traffic. I assume these are the biggest bots they're seeing right now, and any new contenders would be added as they find them. Probably impossible to really block everything, but they've got the web-coverage to detect more than most.

They are lying. They can't detect crawlers unless we tell them who we are.
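To make the self-identification point concrete, here is a minimal sketch, assuming a block that works purely by matching the User-Agent header. That is an assumption for illustration, not a description of how Cloudflare's managed rule actually works. The token list and function name are hypothetical; only CCBot comes from the docs quoted above.

```python
# Hypothetical token list: only "ccbot" is taken from the quoted Cloudflare docs.
SELF_DECLARED_AI_BOT_TOKENS = ("ccbot",)


def is_self_declared_ai_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a known bot token.

    This only catches crawlers that announce themselves; it is not
    Cloudflare's detection logic, just an illustration of header matching.
    """
    ua = user_agent.lower()
    return any(token in ua for token in SELF_DECLARED_AI_BOT_TOKENS)


# A crawler that identifies itself honestly is matched...
honest = "CCBot/2.0 (https://commoncrawl.org/faq/)"
# ...while the same crawler sending a browser-like header is not.
spoofed = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0 Safari/537.36"
)

print(is_self_declared_ai_bot(honest))
print(is_self_declared_ai_bot(spoofed))
```

Header matching alone says nothing about a crawler that sends a browser-like User-Agent, so anything beyond this would need other signals that the thread doesn't describe.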

Enough to more than halve the traffic to most sites if the blocks hold.


