@oblomov @ansuz It's even easier than that, and most bots can be caught on the first request: if the user-agent contains Firefox/ or Chrome/, and you're serving over HTTPS, a request coming from a real browser will contain a sec-fetch-mode header too.[1] Bots don't send it.
Pair it with blocking the agents listed in ai.robots.txt, and ~90% of your bot traffic is gone. If you can afford to block Huawei's and Alibaba's ASNs, you've pretty much gotten rid of all of them.
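For the curious, here's a rough sketch of that check in Python. It's not tied to any particular server or framework: `looks_like_lame_bot`, `is_https` and `blocked_agents` are names I made up for illustration, and the agent names in the usage note are only placeholders for whatever list you actually load from ai.robots.txt.

```python
def looks_like_lame_bot(headers, is_https=True, blocked_agents=()):
    """Heuristic from the post: a browser-like user-agent without
    sec-fetch-mode on HTTPS is most likely a bot.

    All names here are hypothetical; adapt them to your own stack.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}
    ua = h.get("user-agent", "")

    # Anything matching the blocklist (e.g. built from ai.robots.txt) is out.
    if any(name.lower() in ua.lower() for name in blocked_agents):
        return True

    # Only agents that claim to be Firefox or Chrome are expected to
    # send sec-fetch-mode, and only when the site is served over HTTPS.
    claims_browser = "Firefox/" in ua or "Chrome/" in ua
    return is_https and claims_browser and "sec-fetch-mode" not in h
```

You'd call it with the request headers as a dict, something like `looks_like_lame_bot(request_headers, blocked_agents=["GPTBot", "CCBot"])`, and answer with a 403 (or whatever you prefer) when it returns True. Keep the exceptions from the footnote in mind before blocking outright.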
Many of the bots do download CSS, and some even fetch the JS too, by the way. And images? Some of them love 'em.
[1] Exceptions apply: if you put a page in Reader Mode in Firefox and reload while in Reader Mode, no sec-fetch-mode is sent. There are also some applications, like gnome-podcasts, that use a Firefox user-agent but don't send sec-fetch-mode. While there will be false positives, most of them can be worked around, and the gain of catching all the lame bots far outweighs the cons, imo.