2025-07-03 fail2ban some more
=============================

This is a continuation of 2025-06-16 Ban autonomous systems.

I kept wondering why the "recidive" jail never found any repeated offenders from the "butlerian-jihad" jail. I think I know why, now. The "recidive" jail uses the following:

    failregex = ^%(__prefix_line)s(?:\s*fail2ban\.actions\s*%(__pid_re)s?:\s+)?NOTICE\s+\[(?!%(_jailname)s\])(?:.*)\]\s+Ban\s+<HOST>\s*$

Far to the right, it uses <HOST> and that only matches a single IP number. If you examine the regular expression generated and scroll over far enough to the right, you'll see the named groups <ip4> and <ip6>.

    # fail2ban-client get recidive failregex
    The following regular expression are defined:
    `- [0]: ^(?:\[\])?\s*(?:<[^.]+\.[^.]+>\s+)?(?:\S+\s+)?(?:kernel:\s?\[ *\d+\.\d+\]:?\s+)?(?:@vserver_\S+\s+)?(?:(?:(?:\[\d+\])?:\s+[\[\(]?(?:fail2ban(?:-server|\.actions)\s*)(?:\(\S+\))?[\]\)]?:?|[\[\(]?(?:fail2ban(?:-server|\.actions)\s*)(?:\(\S+\))?[\]\)]?:?(?:\[\d+\])?:?)\s+)?(?:\[ID \d+ \S+\]\s+)?(?:\s*fail2ban\.actions\s*(?:\[\d+\])?:\s+)?NOTICE\s+\[(?!recidive\])(?:.*)\]\s+Ban\s+(?:\[?(?:(?:::f{4,6}:)?(?P<ip4>(?:\d{1,3}\.){3}\d{1,3})|(?P<ip6>(?:[0-9a-fA-F]{1,4}::?|::){1,7}(?:[0-9a-fA-F]{1,4}|(?<=:):)))\]?|(?P<dns>[\w\-.^_]*\w))\s*$

I decided to create an additional jail. In my own /etc/fail2ban/jail.d/alex.conf I added a second jail:

    [butlerian-jihad]
    enabled = true
    bantime = 1h

    [butlerian-jihad-week]
    logpath = /var/log/fail2ban.log
    enabled = true
    findtime = 1d
    bantime = 1w
    maxretry = 5

The first one uses the filter /etc/fail2ban/filter.d/butlerian-jihad.conf which remains empty. Remember, entries are added to this jail via a cron job discussed in an earlier post.

    [Definition]

The second one uses a new filter /etc/fail2ban/filter.d/butlerian-jihad-week.conf defining the date pattern and the regular expression to detect "failures" (i.e. a hit).

    [Init]
    # 2025-06-29 01:17:08,887 fail2ban.actions [543]: NOTICE [butlerian-jihad] Ban 1.12.0.0/14
    datepattern = ^%%Y-%%m-%%d %%H:%%M:%%S

    [Definition]
    failregex = NOTICE\s+\[butlerian-jihad\] Ban <SUBNET>

The important part is that this uses <SUBNET> instead of <HOST>. If you scroll over to the right, you'll find a new group:

    # fail2ban-client get butlerian-jihad-week failregex
    The following regular expression are defined:
    `- [0]: NOTICE\s+\[butlerian-jihad\] Ban \[?(?:(?:::f{4,6}:)?(?P<ip4>(?:\d{1,3}\.){3}\d{1,3})|(?P<ip6>(?:[0-9a-fA-F]{1,4}::?|::){1,7}(?:[0-9a-fA-F]{1,4}|(?<=:):)))(?:/(?P<cidr>\d+))?\]?

And it seems to be working.

The Munin graph shows how the butlerian-jihad-week jail immediately jumps to 3000 members.

I had to restart this particular jail a few times. Using --unban makes sense because those deserving of a new ban will be discovered immediately as the findtime was set to one day up above.

    fail2ban-client restart --unban butlerian-jihad-week

#Administration #Butlerian_Jihad #fail2ban

2025-07-05. Two days later.

2025-07-06. Hm. I made a change to Emacs Wiki search, hoping to get rid of the DuckDuckGo dependency:

* I made the page title match much more prominent
* I switched the search from GET to POST
* I count the search via GET as a bot (since it's no longer doable via the user interface)
* I reinstated the old full-text search (essentially a grep within Perl)

I was hoping that it would have very little effect. At about the same time, however, load started creeping up. The question is whether this is caused by so many search requests or not. There aren't many search requests in the logs, and the process monitors don't show unusual activity for the Emacs Wiki processes. Therefore, I think the answer is that the problem lies elsewhere. But where?

Somewhere around the 3rd of July the load minimum seems to rise from 0.5 to 1.0.

This virtual server has two cores so load should remain below 2.0, ideally.

Somewhere around the 3rd of July the number of hosts banned for a week goes up from 2000 to more than 7000.

Is it the processing of all the bans? I don't think so, since the firewall had many thousands of banned networks before. Is it the extra cron jobs monitoring the logs? I don't think so because there's no 15min or 20min periodicity to see. And note how load does come back down to 0.5 for a very short moment around midnight from the 4th to the 5th and in the early morning hours of the 6th. How strange.

2025-07-07. Maybe just a fluke. I mean, if these defences actually worked the way I'd want them to, then an actual attack would feel like a fluke, right? 😄

The load graph shows that the current value is 0.5 although the average is still 1.6.

Also of note: The number of banned-for-a-week IP numbers and networks is up to 7900.

2025-07-08. And just now I found out the hard way that things weren't working as well as they ought to.

Around 18:00 Munin just stops working.

Load was over 140 when I checked in. I had over 80 processes attempting to serve Community Wiki requests. In the last two hours, I had 6629 requests and 3939 of them were for dynamically generated Recent Changes and RSS feeds.
For example:

    # 2h-access-log ^community | egrep 'action=(rss|rc)' | log-request | head
    /wiki?action=rss&all=0&days=3&full=1&rcfilteronly=%22WebDavVsFtp%22&showedit=1
    /wiki?action=rc&all=1&from=1749381715&rcidonly=SoftwareBazzar&showedit=1&upto=1750591315
    /wiki?action=rc&all=0&days=28&rcfilteronly=%22Chalks%22&showedit=1
    /wiki?action=rc&all=1&days=1&rcidonly=2003-10-25&showedit=1
    /wiki?action=rc&all=1&days=1&rcidonly=DatabaseAdministrator&showedit=0
    /wiki?action=rss&all=0&days=21&diff=1&full=1&rcidonly=RecentChangesBookmarklet&showedit=1
    /wiki?action=rss&all=0&days=1&full=1&rcidonly=HeatherJames&showedit=1
    /wiki?action=rss&all=1&days=21&diff=1&full=1&rcidonly=WebDavServer&showedit=1
    /wiki?action=rc&all=0&from=1749653709&rcidonly=UIJWCzCwxpaKVV&showedit=1
    /wiki?action=rc&all=0&days=14&rcidonly=TranslationProject&showedit=1

Once I had confirmed that the victim was /home/alex/communitywiki2.pl, I killed them all:

    for pid in (ps aux|grep communitywiki2|awk '{print $2}'); echo $pid; kill -9 $pid; end

Also stopped respawns:

    monit stop communitywiki

All right, time to launch some scripts. 1180 new entries, banned! Another 553 banned. And 1305 more. Hm, strange. 🤔 Why aren't they all banned in one go? Ah! I think I see: asncounter only prints the top 10 autonomous systems by default!

So I'm going to add a new line to my /etc/cron.d/butlerian-jihad, all on one line, with appropriate time expressions, excluding my own IP numbers, just in case, and so on. You know the drill.

    # watch other expensive end-points
    /root/bin/2h-access-log !^social \
     | egrep 'action=(rss|rc)\&' \
     | awk '{print $2}' \
     | /root/bin/asncounter --no-prefixes --top 50 2>/dev/null \
     | awk '/^[0-9]/ && $1>30 { print $3 }' \
     | ifne xargs /root/bin/asn-networks \
     | ifne xargs fail2ban-client set butlerian-jihad banip > /dev/null

Now if only Munin would start graphing again. Looking at /var/log/munin/munin-update.log I guess I need to rm /var/run/munin/munin-update.lock. Let's see if that helps.
😄

Nearly 3000 entries added to the short-term butlerian-jihad jail (1 hour ban). Sadly, load started climbing again. 40. 50. In total, 44 processes were trying to serve Community Wiki.

The banning by autonomous system doesn't seem all that efficient any more. Looking at the last 20 suspicious entries for Community Wiki, I see that each one is from a different autonomous system.

    # 2h-access-log ^community \
     | egrep '\baction=(rss|rc)\&|\bsearch=' \
     | tail -n 20 \
     | awk '{print $2}' \
     | /root/bin/asncounter --no-prefixes --top 20 2>/dev/null
    count  percent  ASN     AS
    1      5.0      139604  ARROWNET-AS-AP Arrow Net, BD
    1      5.0      270878  SPEEDNET FIBRA, BR
    1      5.0      22646   HARCOM1, US
    1      5.0      22773   ASN-CXA-ALL-CCI-22773-RDC, US
    1      5.0      28669   America-NET Ltda., BR
    1      5.0      198589  JT-AS, IQ
    1      5.0      18881   TELEFONICA BRASIL S.A, BR
    1      5.0      43766   MTC-KSA-AS, SA
    1      5.0      35753   ITC ITC AS number, SA
    1      5.0      212238  CDNEXT, GB
    1      5.0      152637  COMILLA4-AS-AP Comilla Cable TV Online, BD
    1      5.0      27924   AMPLIA COMMUNICATIONS LTD., TT
    1      5.0      56465   THERECOMLTD, UA
    1      5.0      5089    NTL, GB
    1      5.0      53006   ALGAR TELECOM SA, BR
    1      5.0      264932  STAYNET SERVICOS DE INTERNET LTDA, BR
    1      5.0      262700  VERO S.A, BR
    1      5.0      62240   CLOUVIDER Clouvider - Global ASN, GB
    1      5.0      7552    VIETEL-AS-AP Viettel Group, VN
    1      5.0      19635   SANDHILL-AS, US
    total: 20

I've decided to lower the limit from 30 down to 10 expensive requests per ASN! 🫣 And with that, 6922 networks are now banned.

2025-07-09. As I was trying to start my netnews client (tin), I got a message saying that it wouldn't connect to the server as load was too high (over 17). Wow! Now here's a client that respects the server's needs!
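
The per-ASN limit in the cron pipeline is applied by the awk step, awk '/^[0-9]/ && $1>30 { print $3 }'. As a rough sketch of what that step does, here is the same selection in Python. The function name asns_over_limit is made up for this illustration, and the sample rows are copied from asncounter output shown elsewhere in this post. Note that the comparison is strict, so a limit of 10 means 11 or more requests.

```python
import re


def asns_over_limit(lines, limit):
    """Roughly mimic: awk '/^[0-9]/ && $1>LIMIT { print $3 }'."""
    selected = []
    for line in lines:
        # Like /^[0-9]/: skip the header and the "total:" line.
        if not re.match(r"[0-9]", line):
            continue
        fields = line.split()
        # Field 1 is the request count, field 3 is the AS number.
        if len(fields) >= 3 and int(fields[0]) > limit:
            selected.append(fields[2])
    return selected


# A few rows in the format asncounter prints (hypothetical selection).
SAMPLE = """\
count percent ASN AS
10 1.41 8075 MICROSOFT-CORP-MSN-AS-BLOCK, US
6 0.85 6939 HURRICANE, US
5 0.7 11427 TWC-11427-TEXAS, US
4 0.56 6167 CELLCO-PART, US
total: 710
""".splitlines()

for limit in (30, 10, 5):
    print(limit, asns_over_limit(SAMPLE, limit))
```

With these rows, limits of 30 and 10 select nothing, while a limit of 5 selects AS 8075 and AS 6939. The AS with exactly 5 requests is spared because the test is "greater than", not "at least".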

I lowered my limit from 10 to 5 and manually ran my command without waiting for the cron job:

    2h-access-log !^social $MY_IP_NUMBERS \
     | egrep '\baction=(rss|rc)\&|\bsearch=' \
     | awk '{print $2}' \
     | /root/bin/asncounter --top 50 --no-prefixes 2>/dev/null \
     | awk '/^[0-9]/ && $1>5 { print $3 }' \
     | ifne xargs /root/bin/asn-networks \
     | ifne xargs fail2ban-client set butlerian-jihad banip

Ran it a while ago: 2187 banned. Ran it again just now: 430 banned. The distribution was very international. My limit is compared against the first number, the count.

    count  percent  ASN     AS
    10     1.41     8075    MICROSOFT-CORP-MSN-AS-BLOCK, US
    9      1.27     26599   TELEFONICA BRASIL S.A, BR
    9      1.27     8193    BRM-AS, UZ
    8      1.13     9299    IPG-AS-AP Philippine Long Distance Telephone Company, PH
    8      1.13     7552    VIETEL-AS-AP Viettel Group, VN
    7      0.99     60653   FEEDLY-DEVHD, US
    7      0.99     396982  GOOGLE-CLOUD-PLATFORM, US
    6      0.85     45475   AMURINET-NZ Amuri Net, NZ
    6      0.85     203214  HULUMTELE, IQ
    6      0.85     5650    FRONTIER-FRTR, US
    6      0.85     45609   BHARTI-MOBILITY-AS-AP Bharti Airtel Ltd. AS for GPRS Service, IN
    6      0.85     6939    HURRICANE, US
    5      0.7      28210   GIGA MAIS FIBRA TELECOMUNICACOES S.A., BR
    5      0.7      18881   TELEFONICA BRASIL S.A, BR
    5      0.7      262773  PROXXIMA TELECOMUNICACOES SA, BR
    5      0.7      11427   TWC-11427-TEXAS, US
    5      0.7      28126   BRISANET SERVICOS DE TELECOMUNICACOES S.A, BR
    5      0.7      10796   TWC-10796-MIDWEST, US
    4      0.56     199739  EARTHLINK-DMCC-IQ, AE
    4      0.56     6167    CELLCO-PART, US
    4      0.56     36903   MT-MPLS, MA
    4      0.56     27882   Telefonica Celular de Bolivia S.A., BO
    4      0.56     6057    Administracion Nacional de Telecomunicaciones, UY
    4      0.56     15557   LDCOMNET --- I3Dnet ---, FR
    4      0.56     25019   SAUDINETSTC-AS, SA
    4      0.56     8452    TE-AS TE-AS, EG
    4      0.56     22927   Telefonica de Argentina, AR
    4      0.56     133661  NETPLUS-AS Netplus Broadband Services Private Limited, IN
    4      0.56     17639   CONVERGE-AS Converge ICT Solutions Inc., PH
    3      0.42     17072   TOTAL PLAY TELECOMUNICACIONES SA DE CV, MX
    3      0.42     39891   ALJAWWALSTC-AS, SA
    3      0.42     9038    BAT-AS9038, JO
    3      0.42     28649   Desktop Sigmanet Comunicacao Multimidia SA, BR
    3      0.42     212238  CDNEXT, GB
    3      0.42     53006   ALGAR TELECOM SA, BR
    3      0.42     43766   MTC-KSA-AS, SA
    3      0.42     206206  KNET, IQ
    3      0.42     22773   ASN-CXA-ALL-CCI-22773-RDC, US
    3      0.42     9541    CYBERNET-AP Cyber Internet Services Pvt Ltd., PK
    3      0.42     52613   GIGA MAIS FIBRA TELECOMUNICACOES S.A. VIP, BR
    3      0.42     216071  VDSINA, AE
    2      0.28     5713    SAIX-NET, ZA
    2      0.28     5416    Internet Service Provider, BH
    2      0.28     28343   UNIFIQUE TELECOMUNICACOES SA, BR
    2      0.28     12389   ROSTELECOM-AS PJSC Rostelecom. Technical Team, RU
    2      0.28     3462    HINET Data Communication Business Group, TW
    2      0.28     7713    TELKOMNET-AS-AP PT Telekomunikasi Indonesia, ID
    2      0.28     21299   KAR-TEL-AS Almaty, Republic of Kazakhstan, KZ
    2      0.28     5384    EMIRATES-INTERNET Emirates Internet, AE
    2      0.28     36884   MAROCCONNECT, MA
    total: 710

At least we can all agree that it's no longer just Emacs Wiki and China! Remember 2024-11-25 Emacs Wiki and it's still China.
Now it's Community Wiki and the USA, Brazil, Uzbekistan, the Philippines, Vietnam, New Zealand, Iraq, India, the United Arab Emirates, Morocco, Bolivia, Uruguay, France, Saudi Arabia, Egypt, Argentina, Mexico, Jordan, Great Britain, Pakistan, South Africa, Bahrain, Russia, Taiwan, Indonesia, Kazakhstan. With apologies to Mercutio: A plague on all your houses!

2025-07-11. Here's something that confuses me: CPU is around 30% and yet load average is at 10.

A screenshot of htop shows that CPU is slightly over 30% but load average is over 10.

I added the following to my /etc/apache/conf-enabled/blocklist.conf:

    # Temporary block for elaborate recent changes
    RewriteCond "%{QUERY_STRING}" "action=(rc|rss)\&"
    RewriteCond "%{HTTP_HOST}" "communitywiki"
    # (redirect to /nobots means fail2ban is watching)
    RewriteRule "^" https://communitywiki.org/nobots [redirect=410,last]

I'm really starting to think that I have to rewrite my applications because of these AI scrapers. One more example of how they are costing society. The wiki that has been working fine since 2003 would need to protect expensive end-points behind POST requests even though most of them do not involve "posting" any edits:

* no backlink search by clicking on headers
* no filtered Recent Changes unless via a form (i.e. the POST method)
* no filtered RSS feeds (as those always use the GET method)
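
To get a feeling for what the temporary RewriteCond on QUERY_STRING catches, the same pattern can be tried against a few query strings. This is only a sketch: it uses Python's re module rather than Apache's regex engine (the pattern is simple enough that both should agree), it ignores the second condition on HTTP_HOST, the helper name is_blocked is made up, and the two bare query strings at the end are hypothetical examples, not taken from the logs.

```python
import re

# Pattern copied from the RewriteCond on QUERY_STRING.
# RewriteCond patterns are unanchored, so search() is the closest match.
PATTERN = re.compile(r"action=(rc|rss)\&")


def is_blocked(query_string):
    """True if the query string matches the RewriteCond pattern."""
    return PATTERN.search(query_string) is not None


queries = [
    # Taken from the access log excerpt in this post.
    "action=rss&all=0&days=3&full=1&rcfilteronly=%22WebDavVsFtp%22&showedit=1",
    "action=rc&all=1&days=1&rcidonly=DatabaseAdministrator&showedit=0",
    # Hypothetical: plain requests without any further parameters.
    "action=rss",
    "action=rc",
]

for q in queries:
    print(is_blocked(q), q)
```

The first two match and the last two don't. The trailing & in the pattern means that only requests with additional parameters are caught, which seems to fit the "elaborate" in the comment, while a feed requested with nothing but the action parameter would still get through.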