2025-06-16 Ban autonomous systems
=================================

More people have been working on blocking whole ranges of IP numbers, since that catches hosting providers that give bots access to the whole range they control. The bots switch IP numbers all the time so a filter based on IP numbers won't catch them. But if we can determine their autonomous system number (ASN), we can not only block an IP number range, we can block all the IP number ranges the ASN controls.

Now, since these hosting providers also host nice things like other fediverse instances, I don't want to block them forever. I want to block them for 10min, and if they continue after a few of these shorter blocks, I want to block them for a week. Hopefully, by then their clients have ended their Internet slurping and things are back to normal.

This is how fail2ban works, but only for individual IP numbers. I want code that bridges this gap.

#Administration #Butlerian_Jihad #fail2ban

Where to start
--------------

fail2ban-bloc tries to guess (!) IP ranges and bans those using fail2ban. I need to investigate more.

I'm still fascinated by asncounter. It might even work without logfiles, using tcpdump! For now, it generates an interesting Top 10 list. What's interesting about it is that it doesn't require me to query an external service and leak IP numbers. I used to have the following fish function, for example:

    function asn
        for ip in $argv
            dig +short (string split '.' $ip|tac -|string join '.').origin.asn.cymru.com. TXT
        end
    end

What I don't want to do is write another tool like network-lookup-lean which uses asn.routeviews.org for the same purposes, caches results, and so on. I want to get rid of this live lookup.
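
To illustrate what a lookup without a live query amounts to, here is a minimal Python sketch of a longest-prefix match against a local table. It is not how asncounter or pyasn are implemented. The table, the function name `lookup_asn` and the ASNs are all made up for illustration: the networks are documentation ranges and the ASNs come from the range reserved for documentation.

```python
import ipaddress

# Hypothetical prefix table: (network, ASN). Documentation ranges and
# documentation ASNs only, not real routing data.
PREFIXES = [
    (ipaddress.ip_network("192.0.2.0/24"), 64500),
    (ipaddress.ip_network("198.51.100.0/24"), 64501),
    (ipaddress.ip_network("198.51.100.128/25"), 64502),
    (ipaddress.ip_network("2001:db8::/32"), 64500),
]


def lookup_asn(ip, prefixes=PREFIXES):
    """Return the ASN of the most specific prefix containing ip, or None."""
    address = ipaddress.ip_address(ip)
    best = None
    for network, asn in prefixes:
        # Skip prefixes of the other IP version.
        if network.version != address.version:
            continue
        # Keep the match with the longest prefix length seen so far.
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, asn)
    return best[1] if best else None


if __name__ == "__main__":
    for ip in ["192.0.2.17", "198.51.100.200", "2001:db8::1", "203.0.113.9"]:
        print(ip, lookup_asn(ip))
```

A linear scan like this is only good enough for a handful of prefixes. With a full routing table you would want a faster data structure, but the idea stays the same: no IP number leaves the machine.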

Working with asncounter
-----------------------

Here's me looking at the last Apache log file, excluding my fedi instance:

    awk '!/^social/ {print $2}' /var/log/apache2/access.log | asncounter --no-prefixes
    INFO: using datfile ipasn_20250616.1200.dat.gz
    INFO: collecting addresses from
    INFO: loading datfile /root/.cache/pyasn/ipasn_20250616.1200.dat.gz...
    INFO: finished reading data
    INFO: loading /root/.cache/pyasn/asnames.json
    count   percent ASN     AS
    9264    9.49    29691   NINE, CH
    6776    6.94    45899   VNPT-AS-VN VNPT Corp, VN
    4207    4.31    7922    COMCAST-7922, US
    3728    3.82    7018    ATT-INTERNET4, US
    2193    2.25    24940   HETZNER-AS, DE
    2015    2.06    13030   INIT7, CH
    1802    1.85    396982  GOOGLE-CLOUD-PLATFORM, US
    1470    1.51    701     UUNET, US
    1364    1.4     136907  HWCLOUDS-AS-AP HUAWEI CLOUDS, HK
    1257    1.29    32934   FACEBOOK, US
    total: 97657

INIT7 is my Internet service provider at home and NINE is my hosting provider for the server. Better not ban those! 😅

So what is VNPT-AS-VN VNPT Corp doing? This could use better tool support!

    grep '2001:ee0:4f' /var/log/apache2/access.log | awk '{print $8}' | sort | uniq -c | head
          2 /c2-search?url=http%3A%2F%2Fwiki.c2.com%2F%3Fsearch%3D%22OpenSourceSecondLife%22
          1 /cgi-bin/wiki.pl?ErcReplace
          1 /cw-fr/BarneySock
          1 /edit/2011-06-16_Session_Reports_Are_Read_Just_Once,_If_At_All
          1 /edit/2019-03-15_Dungeon_Master%E2%80%99s_Handbook
          1 /emacs/AcrobatReader
          1 /emacs?action=admin;id=AssociationList
          1 /emacs?action=admin&id=Comments_on_AdamShand
          1 /emacs?action=admin&id=Comments_on_Categor%C3%ADaRegi%C3%B3n
          1 /emacs?action=admin&id=Comments_on_nickat

OK, this is bots. Useless random URLs.

Ban all the networks managed by an ASN
--------------------------------------

I'm going to use ipset to use two lists, banlist and banlist6. I use these two for ban-cidr, too.

    # Use hash:net because of the CIDR stuff
    ipset create banlist hash:net
    iptables -I INPUT -m set --match-set banlist src -j DROP
    iptables -I FORWARD -m set --match-set banlist src -j DROP
    ipset create banlist6 hash:net family inet6
    ip6tables -I INPUT -m set --match-set banlist6 src -j DROP
    ip6tables -I FORWARD -m set --match-set banlist6 src -j DROP

To ban all the IP ranges an ASN manages, I created the following little fish function using ip.guide:

    function asn-ban
        for asn in $argv
            for cidr in (curl -sL "https://ip.guide/as$asn" | jq --raw-output '.routes.v4[]')
                echo ipset add banlist $cidr
            end
            for cidr in (curl -sL "https://ip.guide/as$asn" | jq --raw-output '.routes.v6[]')
                echo ipset add banlist6 $cidr
            end
        end
    end

Let's try it with the ASN 45899!

    asn-ban 45899 | sh
    netfilter-persistent save

For more about netfilter-persistent save see the comments on 2025-01-23 The bots are at it again.

When I ran the asn-ban command above, I noticed that I got a single "it's already added" response. Before adding the same numbers to my shell script, therefore:

    for cidr in (asn-ban 45899|awk '{print $4}'); if grep -q $cidr bin/admin/ban-cidr; echo $cidr; end; end

That told me I had to remove 14.187.96.0/20 from my script. Once this is done:

    echo (echo "#"; date --iso) >> bin/admin/ban-cidr
    asn-ban 45899 >> bin/admin/ban-cidr

I really need to figure out how to manage this smartly. And I need to figure out a way to unban the whole list!

Integration with fail2ban
-------------------------

Let's start with fail2ban. I need a jail! Every jail needs a filter! In /etc/fail2ban/jail.d/alex.conf (this is where I maintain all my jails) I added:

    [butlerian-jihad]
    enabled = true
    bantime = 1d

Note that this jail doesn't define log paths. I hope that works as intended.

I created a matching filter with no definition in /etc/fail2ban/filter.d/butlerian-jihad.conf:

    # Author: Alex Schroeder
    [Definition]

Reload it all, and check:

    fail2ban-client reload
    OK
    fail2ban-client status
    Status
    |- Number of jail:      6
    `- Jail list:   alex-apache, alex-bots, butlerian-jihad, ngircd, recidive, sshd

Nice! So now I have a new jail.

Undo the banlist
----------------

    asn-ban 45899 | sed 's/ipset add/ipset del/' | sh

I also manually edited my ban-cidr file to remove the lines I added above. Let's have fail2ban handle this!

Switch from ipset to fail2ban-client
------------------------------------

    function asn-ban
        for asn in $argv
            set --local cidr (curl -sL "https://ip.guide/as$asn" | jq --raw-output '.routes.v4[],.routes.v6[]')
            echo fail2ban-client set butlerian-jihad banip $cidr
        end
    end

Examine it:

    asn-ban 45899 | less

Run it:

    asn-ban 45899 | sh
    3640

If you messed up, clear the jail:

    fail2ban-client reload --unban butlerian-jihad

Check the jail:

    fail2ban-client get butlerian-jihad banned

Count the entries in the jail:

    fail2ban-client get butlerian-jihad banned | sed 's/\'/"/g' | jq length
    3640

What do we have?
----------------

With asncounter we have a tool to quickly discover if an ASN is providing services to a bot. With asn-ban we have a tool to quickly add all the IP networks the ASN is managing to a jail for fail2ban. The jail which we called butlerian-jihad bans the IP networks for a day.

What's left to do?
------------------

I should check whether this actually works! Let's see whether the ban gets lifted after 24h. That's the main point of this exercise!

asn-ban uses the ip.guide site for the data. This should be rewritten such that it uses the same data as asncounter. I guess that would be pyasn. See below!

I need a cron job that runs every 10 minutes, takes the last ten minutes worth of Apache access log files, ignores the fedi subdomain, identifies all the ASNs, ignores my own ASNs and bans the rest.

Some bans
---------

Wow, some of the autonomous systems are big.
These are the ones I banned yesterday and today:

    # AMAZON-02, US (18772!)
    asn-ban 16509|sh
    # VNPT-AS-VN VNPT Corp, VN (3640!)
    asn-ban 45899 | sh
    # TENCENT-NET-AP Shenzhen Tencent Computer Systems Company Limited, CN (2278!)
    asn-ban 45090|sh
    # ALIBABA-CN-NET Alibaba US Technology Co., Ltd., CN (852!)
    asn-ban 45102 | sh
    # FACEBOOK, US (541!)
    asn-ban 32934|sh
    # SEMRUSH-AS, CY (5!)
    asn-ban 209366|sh

Using pyasn data files from the command-line
--------------------------------------------

How to determine the name of an autonomous system number:

    jq --raw-output '.["32934"]' .cache/pyasn/asnames.json
    FACEBOOK, US

How to determine the networks for an ASN:

    zgrep '209366$' .cache/pyasn/ipasn_20250616.1200.dat.gz | awk '{print $1}'
    85.208.96.0/24
    85.208.97.0/24
    85.208.99.0/24
    185.170.167.0/24
    185.191.171.0/24

How to determine the ASN of a CIDR:

    zgrep '^85\.208\.96\.0/24' .cache/pyasn/ipasn_20250616.1200.dat.gz | awk '{print $2}'
    209366

ASN networks without an external service
----------------------------------------

asn-networks is a tiny script with a bunch of lines taken from asncounter to print the IP ranges managed by one or more autonomous systems.

    python3 asn-networks 209366
    185.170.167.0/24
    185.191.171.0/24
    85.208.96.0/24
    85.208.97.0/24
    85.208.99.0/24

It uses the pyasn datafiles that a regular run of asncounter has downloaded. That is to say, asn-networks does not download or refresh these files. I'm assuming that you have run asncounter just moments earlier.

Given this script, we can now call fail2ban-client as follows (I use fish) to ban all the networks:

    fail2ban-client set butlerian-jihad banip (asn-networks 209366)
    5

Unbanning works the same way:

    fail2ban-client set butlerian-jihad unbanip (asn-networks 209366)
    5

Remember that fail2ban-client prints the number of IP numbers or ranges added or removed.

Identifying suspicious ASN
--------------------------

What is suspicious activity? How about this: In a 2h window, no ASN should send more than 1000 requests?
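
As a minimal sketch of that rule in Python: take a report in the "count percent ASN AS" layout that asncounter printed earlier and keep the ASNs above a limit. The function name `busy_asns` is made up for illustration. The sample rows are the first three from the whole-logfile run near the top of this page, not from a 2h window, which is why the example uses a limit of 5000 instead of 1000.

```python
def busy_asns(report, limit):
    """Return the ASNs whose request count exceeds limit.

    report is text in the "count percent ASN AS" layout; lines that
    don't start with a number (header, INFO, total) are skipped.
    """
    result = []
    for line in report.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        count, asn = int(fields[0]), fields[2]
        if count > limit:
            result.append(asn)
    return result


# Rows copied from the report above; the limit below is an arbitrary pick.
SAMPLE = """\
count percent ASN AS
9264 9.49 29691 NINE, CH
6776 6.94 45899 VNPT-AS-VN VNPT Corp, VN
4207 4.31 7922 COMCAST-7922, US
total: 97657
"""

if __name__ == "__main__":
    print(busy_asns(SAMPLE, 5000))
```

Note that a rule this simple would happily select my own hosting provider, which is why the input needs filtering first.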

So we need a script that filters the log files and prints a 2h window, skipping the lines we want to ignore: 2h-access-log. Then pass the IP numbers to asncounter, throw away all the things we don't care about and just print the appropriate lines:

    bin/2h-access-log !^social \
     | awk '{print $2}' \
     | bin/asncounter --no-prefixes 2>/dev/null \
     | awk '/^[0-9]/ && $1>1000 { print }'
    3062    31.93   24940   HETZNER-AS, DE
    1642    17.12   16276   OVH, FR

So do I dare ban those numbers?? I'm not sure!

I should figure out a way to find those 3062 requests made by services hosted on Hetzner. asn-access-log does just that. You pass it an ASN, it determines all the networks it manages and then it filters standard input, assuming that it consists of Apache access log lines (what counts is that the second field is an IP number).

    bin/2h-access-log !^social | bin/asn-access-log 24940

I see a lot of RSS services (NewsBlur, fiperbot, MyNewspaper Agent, FreshRSS), git, some bot (from the 159.69.0.0/16 range, for example), and on and on. Ugh.

It's not easy to know what to do! I think the best answer would be to lower the stakes but also ban for shorter amounts of time and let fail2ban handle the rest. The only thing I need to consider is whether I find the current amount of resources spent OK. Do I?

Let's look at the latest numbers. This here shows that fedi traffic is 60% Hetzner and OVH. This makes it hard for me to block these autonomous systems.

    bin/2h-access-log ^social !178.209.50.237 \
     | awk '{print $2}' \
     | bin/asncounter --no-prefixes
    INFO: using datfile ipasn_20250616.1200.dat.gz
    INFO: collecting addresses from
    INFO: loading datfile /root/.cache/pyasn/ipasn_20250616.1200.dat.gz...
    INFO: finished reading data
    INFO: loading /root/.cache/pyasn/asnames.json
    count   percent ASN     AS
    2148    45.36   24940   HETZNER-AS, DE
    738     15.59   16276   OVH, FR
    273     5.77    14061   DIGITALOCEAN-ASN, US
    202     4.27    14361   HOPONE-GLOBAL, US
    195     4.12    15796   SALT-, CH
    105     2.22    214640  HOSTUP HOSTUP, SE
    102     2.15    63949   AKAMAI-LINODE-AP Akamai Connected Cloud, SG
    62      1.31    47692   NESSUS, AT
    59      1.25    197540  NETCUP-AS netcup GmbH, DE
    50      1.06    44684   MYTHIC Mythic Beasts Ltd, GB
    total: 4735

What's the situation without fedi traffic, keeping in mind that I will most likely not be able to block fedi hosters?

    bin/2h-access-log !^social !178.209.50.237 \
     | awk '{print $2}' \
     | bin/asncounter --no-prefixes
    INFO: using datfile ipasn_20250616.1200.dat.gz
    INFO: collecting addresses from
    INFO: loading datfile /root/.cache/pyasn/ipasn_20250616.1200.dat.gz...
    INFO: finished reading data
    INFO: loading /root/.cache/pyasn/asnames.json
    count   percent ASN     AS
    249     5.47    7922    COMCAST-7922, US
    189     4.16    9808    CHINAMOBILE-CN China Mobile Communications Group Co., Ltd., CN
    129     2.84    7018    ATT-INTERNET4, US
    122     2.68    396982  GOOGLE-CLOUD-PLATFORM, US
    118     2.59    24940   HETZNER-AS, DE
    96      2.11    55836   RELIANCEJIO-IN Reliance Jio Infocomm Limited, IN
    96      2.11    56046   CMNET-JIANGSU-AP China Mobile communications corporation, CN
    75      1.65    140061  CHINANET-QINGHAI-AS-AP Qinghai Telecom, CN
    73      1.61    4837    CHINA169-BACKBONE CHINA UNICOM China169 Backbone, CN
    70      1.54    701     UUNET, US
    total: 4548

The autonomous systems that show up in the second list but not in the first list are my prime candidates, like COMCAST and CHINAMOBILE-CN.

So how about going after the autonomous systems on the second list that produce more than 1000 hits in a 2h period. Something like this?

I'm going to put this into /etc/cron.hourly/butlerian-jihad

    #!/bin/sh
    bin/2h-access-log !^social !178.209.50.237 \
     | awk '{print $2}' \
     | bin/asncounter --no-prefixes 2>/dev/null \
     | awk '/^[0-9]/ && $1>1000 { print $3 }' \
     | ifne xargs bin/asn-networks \
     | ifne xargs echo fail2ban-client set butlerian-jihad banip

I use ifne to prevent the execution of the command if there is no input. Thanks, @acdw@tilde.zone!

Summary
-------

/etc/cron.hourly/butlerian-jihad runs every hour and checks if there have been any abusive autonomous systems in the last two hours. If so, they are banned. Note how I've added my home IPv4 and IPv6 because I use my site a lot. 😅

    #!/bin/sh
    bin/2h-access-log !^social !178.209.50.237 !MY-HOME-IPV4 !MY-HOME-IPV6 \
     | awk '{print $2}' \
     | bin/asncounter --no-prefixes 2>/dev/null \
     | awk '/^[0-9]/ && $1>500 { print $3 }' \
     | ifne xargs bin/asn-networks \
     | ifne xargs fail2ban-client set butlerian-jihad banip \
     > /dev/null

This drops the output (the number of new bans) because otherwise the cron job mails that number to me.

2h-access-log prints the last two hours worth of log lines from /var/log/apache2/access.log (and access.log.1 if necessary). The !^social argument ensures that connecting to my fedi server doesn't trigger the ban hammer. The !178.209.50.237 argument ensures that I don't ban the server itself as it monitors stuff and as I test things on the server. I also had to add my home IP numbers!

asncounter finds the autonomous system numbers for all the IP numbers in the web server log file and prints a report.

asn-networks then takes the selected autonomous system numbers and returns the IP ranges they manage.

These are then banned by fail2ban-client using the butlerian-jihad jail.

The butlerian-jihad jail is enabled via a config file in /etc/fail2ban/jail.d/.
In my case, the file is called alex.conf and for this jail, it says:

    [butlerian-jihad]
    enabled = true
    bantime = 1h

The jail also needs a filter definition even though no filtering happens as no logfile is checked. My /etc/fail2ban/filter.d/butlerian-jihad.conf contains just this:

    # Author: Alex Schroeder
    [Definition]

What this means is that every hour, an autonomous system can get banned for generating more than 500 hits in 2h. If they are banned, they are banned for 1h. If they are banned for activity in the last hour leading up to the ban (more than 500 hits in 1h), the script will find the same log entries and ban them "again". This results in no changes in the jail, since all the networks are already in the butlerian-jihad jail.

The bans themselves are reported in /var/log/fail2ban.log.

I've also enabled the recidive jail. That is, in the same file where I defined my butlerian-jihad jail, I have:

    [recidive]
    enabled = true

The defaults are in /etc/fail2ban/jail.conf:

    [recidive]
    logpath  = /var/log/fail2ban.log
    banaction = %(banaction_allports)s
    bantime  = 1w
    findtime = 1d

So if some network is banned more than five times in a day, it is banned for a week. I say five times because maxretry is set to 5 in /etc/fail2ban/jail.conf.

Let's assume a scraper is started from some network managed by an autonomous system. It starts using IP numbers from all its ranges. It sends 400 requests per hour, more than a human could read and more than a feed reader should need, etc.

* after the first hour, nothing happens, as 400 is less than the 500 needed to trigger the system
* after the second hour, the ASN is banned because the sum total for the last two hours is 800
* after the third hour, the ASN is unbanned and not banned again because it only made 400 requests in the second hour
* after the fourth hour, the ASN is banned again
* after the fifth hour, the ASN is unbanned
* after the sixth hour, the ASN is banned for the third time
* after the seventh hour, the ASN is unbanned
* after the eighth hour, the ASN is banned for the fourth time
* after the ninth hour, the ASN is unbanned
* after the tenth hour, the ASN is banned for the fifth time
* after the eleventh hour, the ASN is unbanned
* after the twelfth hour, the ASN is banned for the sixth time, the recidive filter kicks in and the networks belonging to the ASN are banned for a week

This escalation takes twelve hours. The ASN was already banned for half this time. Assuming the scraper keeps at it, the pattern repeats every 7½ days (12h of escalation plus a week of being banned) and the abusive ASN still gets service on 6h out of 180h or 3% of the time. For my taste, that is still way too nice. Let's see how this goes for a while.

Note: This doesn't actually work, as I discovered later. See a follow-up post for how to ban repeated offenders.

I'm already looking forward to dropping my banlist and banlist6 sets I created for ban-cidr.

Aftermath: cleaning up ban-cidr
-------------------------------

> "If you want to partake in unsupervised banning with no feedback, no
> due process, just automatic ban-hammers, take a look at this script
> full of firewall commands." -- 2025-01-23 The bots are at it again

Back when I first encountered the distributed AI bot attacks, I wrote 2024-09-18 Emacs Wiki and China and began working on ban-cidr. In later posts, I just automated the work of getting from an IP number to a network range and adding that to the script.

Now that I hopefully have an automatic solution where I only need to fine-tune the time-windows and the limits, it's time to expire all those bans. There are currently over 40,000 of these banned networks.

    ipset list banlist | tail -n +9 | wc -l
    46920
    ipset list banlist6 | tail -n +9 | wc -l
    9

So slowly, over time, I'm planning to remove these.

    for ip in (ipset list banlist | tail -n +9 | head -n 1000); ipset del banlist $ip; end
    netfilter-persistent save

Let's see how it goes! 😂

(Done: all of these blocks are now removed.)

Petty banning
-------------

Let's look at the output again:

    bin/2h-access-log !^social !178.209.50.237 | awk '{print $2}' | bin/asncounter --no-prefixes
    INFO: using datfile ipasn_20250616.1200.dat.gz
    INFO: collecting addresses from
    INFO: loading datfile /root/.cache/pyasn/ipasn_20250616.1200.dat.gz...
    INFO: finished reading data
    INFO: loading /root/.cache/pyasn/asnames.json
    count   percent ASN     AS
    1041    8.44    45899   VNPT-AS-VN VNPT Corp, VN
    929     7.54    13030   INIT7, CH
    513     4.16    7922    COMCAST-7922, US
    485     3.93    24940   HETZNER-AS, DE
    351     2.85    14061   DIGITALOCEAN-ASN, US
    304     2.47    7018    ATT-INTERNET4, US
    296     2.4     16276   OVH, FR
    293     2.38    45102   ALIBABA-CN-NET Alibaba US Technology Co., Ltd., CN
    237     1.92    55836   RELIANCEJIO-IN Reliance Jio Infocomm Limited, IN
    218     1.77    16509   AMAZON-02, US
    total: 12327

What is RELIANCEJIO-IN doing?

    bin/2h-access-log !^social | bin/asn-access-log 55836 | head | log-request
    /nobots
    /emacs?action=rc&from=1725473811&rcidonly=acidtoyman&showedit=1&upto=1726683411
    /favicon.ico
    /nobots
    /emacs?action=rc&all=1&from=1728680271&rcidonly=Comments_on_hideif.el&showedit=1
    /emacs?action=rss&all=1&days=14&full=1&rcidonly=mistilteinn.el&showedit=0
    /nobots
    /emacs?action=rc&all=1&from=1728946699&rcidonly=ArneBab&showedit=1&upto=1729033099
    /nobots
    /emacs?action=rc&all=0&from=1727470564&rcidonly=Comments_on_anything&showedit=1

Ugh. They need to go.

First, create the two lists. At this level we need two different lists for IPv4 and IPv6.

    ipset create banlist hash:net
    iptables -I INPUT -m set --match-set banlist src -j DROP
    iptables -I FORWARD -m set --match-set banlist src -j DROP
    ipset create banlist6 hash:net family inet6
    ip6tables -I INPUT -m set --match-set banlist6 src -j DROP
    ip6tables -I FORWARD -m set --match-set banlist6 src -j DROP

Then define a new version of asn-ban:

    function asn-ban
        set --local data (asn-networks --json $argv)
        for asn in $argv
            for ip in (echo $data | jq --raw-output ".[\"$asn\"].v4[]")
                echo ipset add banlist $ip
            end
            for ip in (echo $data | jq --raw-output ".[\"$asn\"].v6[]")
                echo ipset add banlist6 $ip
            end
        end
    end

This uses the --json option for asn-networks so that we only need to call it once and yet we get two lists: one for banlist and one for banlist6.

To ban the offending ASN:

    asn-ban 55836|sh

Before I do that, however, I want to finish clearing the existing lists.

The reason I call this petty banning is because I'm starting to ban autonomous systems even though their bots are "well behaved" in as much as they don't exceed the thresholds I defined. And yet they seem to be part of that great parade to honour the CO₂ god, the computation of useless shit.

2025-06-20. I'm thinking about alternatives but I think that's not worth it. For example: Perhaps it's important to look at relative distribution?

    site-log transjovian | log-ip | asncounter --no-prefixes --top 3 2>/dev/null
    count   percent ASN     AS
    7       36.84   132203  TENCENT-NET-AP-CN Tencent Building, Kejizhongyi Avenue, CN
    3       15.79   216071  VDSINA, AE
    3       15.79   396982  GOOGLE-CLOUD-PLATFORM, US
    total: 19

At first I thought, more than a third of all requests for Tencent? I must block them. But then I saw that it was just 7 requests in 2h. Not worth it.

Here I saw that there were more requests, and an 80% share!

    site-log orientalisch | log-ip | asncounter --no-prefixes --top 3 2>/dev/null
    count   percent ASN     AS
    172     81.52   45102   ALIBABA-CN-NET Alibaba US Technology Co., Ltd., CN
    20      9.48    396982  GOOGLE-CLOUD-PLATFORM, US
    10      4.74    132203  TENCENT-NET-AP-CN Tencent Building, Kejizhongyi Avenue, CN
    total: 211

But when you look at it, Alibaba is just fetching robots.txt all the time. I don't know who runs this bot. It's clearly a waste of CO₂. And yet… not worth it.

    site-log orientalisch | asn-access-log 45102 | log-request | rank-lines
        172 /robots.txt

So, I don't know. And if I don't care about the relative share of requests, then I also don't have to count them per site. I already wrote a little thing to give me a regular expression for every site I host! But now I'm not going to use it. I leave it here for you, dear reader. 😄

    awk '/^MDomain/ {split($0, sites)
      result = sites[2]
      for (i = 3; i <= length(sites); i++)
        result = result "|" sites[i]
      print result
    }' /etc/apache2/sites-enabled/*.conf

On the positive side, the current system seems to be working:

    awk '/butlerian-jihad/ { print $7, $8 }' /var/log/fail2ban.log | rank-lines
         10 Ban 45.254.32.0/22
         10 Ban 45.124.92.0/22
         10 Ban 43.239.220.0/23
         10 Ban 23.48.56.0/22
         10 Ban 23.48.52.0/22
         10 Ban 23.32.249.0/24
         10 Ban 221.132.32.0/21
         10 Ban 203.210.128.0/17
         10 Ban 203.162.0.0/16
         10 Ban 203.160.132.0/22

A bunch of networks were banned 10 times! Who do the Top 100 networks belong to? It's a Vietnamese autonomous system.

    awk '/butlerian-jihad/ { print $7, $8 }' /var/log/fail2ban.log \
     | rank-lines -n 100 \
     | awk '{split($3,parts,"/"); print parts[1]}' \
     | xargs asn-find \
     | awk '{print $2,$3}' \
     | rank-lines
         98 45899 VNPT-AS-VN
          2 7643 VNPT-AS-VN

OK! (You can find all the fish functions I use in the admin directory.)

2025-07-02. I was looking at the Traveller Subsector Generator and trying to figure out why memory wasn't being freed. I wondered whether somebody else was hitting this expensive end-point.

    grep "^campaignwiki.org.* /traveller" /var/log/apache2/access.log \
     | grep -v "Monit" \
     | leech-detector \
     | head -n 4
    Total hits: 125
                                          IP |       Hits | Bandw. | Rel. | Interv. | Status
    ----------------------------------------:|-----------:|-------:|-----:|--------:|-------
                               50.32.203.123 |         32 |     7K |  25% |  552.4s | 200 (40%), 404 (31%), 302 (28%)

And who is that?

    asn 50.32.203.123
    "5650 | 50.32.128.0/17 | US | arin | 2010-09-24"

What else are they requesting?

    2h-access-log !^social | asn-access-log 5650 | log-request | rank-lines
          4 /nobots
          1 /emacs?action=rss&all=1&days=14&rcidonly=EmacsListen&showedit=1
          1 /emacs?action=rss&all=1&days=14&full=1&rcidonly=imapua&showedit=0
          1 /emacs?action=rc&all=1&from=1728194306&rcidonly=Comments_on_emacsniftytricks&upto=1728799106
          1 /emacs?action=rc&all=1&from=1728063015&rcidonly=d-insert-import.el&showedit=1
          1 /emacs?action=rc&all=1&from=1726452954&rcidonly=%E3%82%AA%E3%83%AB%E3%82%B0%E3%83%A2%E3%83%BC%E3%83%89&upto=1727662554
          1 /emacs?action=rc&all=1&from=1725759396&rcidonly=Comments_on_Registers&showedit=1&upto=1726968996
          1 /cgi-bin/alex?action=rss&full=1
          1 /c2-search?url=http%3A%2F%2Fwiki.c2.com%2F%3Fsearch%3D%22TheIdeal%22

Clearly, bots.

The Apache configuration detects some suspicious requests and redirects these to /nobots. My thinking is: I could use this information to extrapolate! Who is getting the /nobots results?

    /root/bin/2h-access-log !^social \
     | grep "GET /nobots" \
     | awk '{print $2}' \
     | /root/bin/asncounter --no-prefixes 2>/dev/null
    count   percent ASN     AS
    67      4.23    7922    COMCAST-7922, US
    66      4.17    45899   VNPT-AS-VN VNPT Corp, VN
    53      3.35    7018    ATT-INTERNET4, US
    50      3.16    55836   RELIANCEJIO-IN Reliance Jio Infocomm Limited, IN
    37      2.34    28573   Claro NXT Telecomunicacoes Ltda, BR
    24      1.52    21928   T-MOBILE-AS21928, US
    23      1.45    24560   AIRTELBROADBAND-AS-AP Bharti Airtel Ltd., Telemedia Services, IN
    20      1.26    20115   CHARTER-20115, US
    17      1.07    5089    NTL, GB
    17      1.07    701     UUNET, US
    total: 1584

So maybe that's the answer: More than 50 requests resulting in a /nobots response qualify the ASN for the butlerian-jihad jail!

I have migrated /etc/cron.hourly/butlerian-jihad to /etc/cron.d/butlerian-jihad and it now has two jobs to run.

    SHELL=/usr/bin/fish
    # Don't forget to replace the !IP argument with your own server or home IPs or you'll end up banning yourself.
    # ban very active autonomous systems
    1,16,31,46 * * * * root /root/bin/2h-access-log !^social !178.209.50.237 | awk '{print $2}' | /root/bin/asncounter --no-prefixes 2>/dev/null | awk '/^[0-9]/ && $1>500 { print $3 }' | ifne xargs /root/bin/asn-networks | ifne xargs fail2ban-client set butlerian-jihad banip > /dev/null
    # ban autonomous systems with a lot of bots based on Apache rules
    5,20,35,50 * * * * root /root/bin/2h-access-log !^social !178.209.50.237 | grep "GET /nobots" | awk '{print $2}' | /root/bin/asncounter --no-prefixes 2>/dev/null | awk '/^[0-9]/ && $1>50 { print $3 }' | ifne xargs /root/bin/asn-networks | ifne xargs fail2ban-client set butlerian-jihad banip > /dev/null

Too bad these cron jobs have to fit on a single line!

2025-07-08. Continued here.