2026-01-08 Shared block lists for the scrapers
==============================================

The artificial intelligence bros are scraping the net and they don't care if the small websites fall over or not for days on end. But you already know all that. It's the founding story of my Butlerian Jihad.

The setup I use is documented elsewhere. If you have a different setup and would still like to access the list of IP addresses I'm currently blocking, maybe to block them yourself, after careful review… You do carefully review these lists, right? … Right? 😏

Well, in any case, also as a kind of explanation of what is happening, here are some files I'm regenerating every hour:

* blocked IPv4 address ranges
* blocked IPv6 address ranges
* blocked ASN
* the script

Note that I'm blocking the ASN by looking up all the IP address ranges they are managing and blocking them. So the list of address ranges is the result but the list of ASNs is the intent.

#Butlerian_Jihad #Administration

2026-01-23. An interesting alternative:

> While reviewing the access and error logs to ensure things were
> working as expected, two things stood out: Most of my traffic was
> coming over HTTP/1.X (>80%). Almost all HTTP/1.X traffic was bad
> (e.g., basic attacks, bad bots, scrapers, etc.). – Selectively
> Disabling HTTP/1.0 and HTTP/1.1, by Mark McBride

With explicit exceptions for known non-standard browsers.
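
Back to the block lists: if you'd like to review the address ranges before using them, here is a minimal sketch in Python. It is not the script mentioned above. It assumes the files hold one CIDR range per line, possibly with `#` comments, which may not match the actual format. The function names `parse_ranges`, `collapse` and `is_blocked` are made up for this illustration, and the sample data uses documentation-only address ranges.

```python
import ipaddress

def parse_ranges(text):
    """Parse one CIDR range per line (assumed format); skip blanks and '#' comments."""
    networks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            networks.append(ipaddress.ip_network(line, strict=False))
    return networks

def collapse(networks):
    """Merge overlapping and adjacent ranges, separately for IPv4 and IPv6."""
    v4 = [n for n in networks if n.version == 4]
    v6 = [n for n in networks if n.version == 6]
    return list(ipaddress.collapse_addresses(v4)) + list(ipaddress.collapse_addresses(v6))

def is_blocked(address, networks):
    """Return True if the address falls inside any of the given ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip.version == n.version and ip in n for n in networks)

# Hypothetical sample data, not taken from the real lists.
SAMPLE = """
192.0.2.0/25
192.0.2.128/25   # adjacent to the previous range
2001:db8::/32
"""

if __name__ == "__main__":
    ranges = collapse(parse_ranges(SAMPLE))
    for net in ranges:
        print(net)
    print(is_blocked("192.0.2.200", ranges))
    print(is_blocked("198.51.100.1", ranges))
```

Collapsing the ranges first keeps the list short, which helps when reviewing it by eye and when handing it to a firewall.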