> When you enable this feature via a pre-configured managed rule, Cloudflare can detect and block verified AI bots that comply with robots.txt and respect crawl rates, and do not hide their behavior from your website. The rule has also been expanded to include more signatures of AI bots that do not follow the rules.
We already know companies like Perplexity are masking their traffic. I'm sure there's more than meets the eye, but taking this at face value, doesn't punishing respectful and transparent bots only incentivize obfuscation?
edit: This link[0], posted in a comment elsewhere, addresses this question. tldr, obfuscation doesn't work.
> We leverage Cloudflare global signals to calculate our Bot Score, which for AI bots like the one above, reflects that we correctly identify and score them as a “likely bot.”
> When bad actors attempt to crawl websites at scale, they generally use tools and frameworks that we are able to fingerprint. For every fingerprint we see, we use Cloudflare’s network, which sees over 57 million requests per second on average, to understand how much we should trust this fingerprint. To power our models, we compute global aggregates across many signals. Based on these signals, our models were able to appropriately flag traffic from evasive AI bots, like the example mentioned above, as bots.
"doesn't punishing respectful and transparent bots only incentivize obfuscation?"
Sure, but we crossed that bridge over 20 years ago. It's not creating an arms race where there wasn't already one.
Which is my generic response to everyone bringing similar ideas up. "But the bots could just...", yeah, they've been doing it for 20+ years and people have been fighting it for just as long. Not a new problem, not a new set of solutions, no prospect of the arms race ending any time soon, none of this is new.
> The rule has also been expanded to include more signatures of AI bots that do not follow the rules.
The Block AI Bots rule on the Super Bot Fight Mode page does filter out most bot traffic. I was getting 10x the traffic from bots than I was from users.
It definitely doesn't rely on robots.txt or user agent. I had to write a page rule bypass just to let my own tooling work on my website after enabling it.
Cloudflare already knows how to make the web hell for people they don't like.
I read the robots.txt entries as those AI bots that will be not marked as "malicious" and that will have the opportunity to be allowed by websites. The rest will be given the Cloudflare special.
>doesn't punishing respectful and transparent bots only incentivize obfuscation?
They're cloudflare and it's not like it's particularly easy to hide a bot that is scraping large chunks of the Internet from them. On top of the fact that they can fingerprint any of your sneaky usage, large companies have to work with them so I can only assume there are channels of communication where cloudflare can have a little talk with you about your bad behavior. I don't know how often lawyers are involved but I would expect them to be.
edit: This link[0], posted in a comment elsewhere, addresses this question. tldr, obfuscation doesn't work.
[0] https://blog.cloudflare.com/declaring-your-aindependence-blo...