2026-02-20 Producing garbage for the AI scrapers
================================================

I have joined the ranks of iocaine, nepenthes and friends. It's called garbage. I use it with an Apache web server. There, some of my sites include a config file with a few checks for bots: bots looking for WordPress manifests, admin dashboards and the like. These all get served a 410 "Gone" response, and the error document for 410 is something like this: No Bots.

This is garbage running with a copy of Moby Dick, acting as a Markov text generator. It looks at two words and predicts the next word, prints it, and repeats. Every word has a 2% chance of being a link to another generated page.

I'm going to try this and switch off one of my watchers.

Once I had added botcheck as reported a few days ago, I expected the number of blocked IP address ranges to drop significantly. But that was not the case. A Munin graph shows that the number of blocked IP address ranges is still around 20,000.

I realized that most of my banning now happens due to the "no bots" watcher.

    ~# watch-recent-bans | cut -d ' ' -f 1-3,5- | sed 's/\[.*\]//'
    Feb 19 17:40:54 watch-nobots: 9
    Feb 19 19:43:02 watch-nobots: 826
    Feb 19 21:12:13 watch-active-autonomous-systems: 15
    Feb 19 22:10:45 watch-active-autonomous-systems: 26
    Feb 19 23:02:04 watch-nobots: 7
    Feb 20 03:11:33 watch-nobots: 6652
    Feb 20 05:04:27 watch-nobots: 6215
    Feb 20 09:39:40 watch-nobots: 6221
    Feb 20 10:12:18 watch-nobots: 39
    Feb 20 10:40:51 watch-nobots: 54
    Feb 20 11:41:39 watch-nobots: 7
    Feb 20 15:10:42 watch-nobots: 4
    Feb 20 16:21:37 watch-nobots: 4
    Feb 20 17:51:40 watch-nobots: 4
    Feb 20 18:41:58 watch-expensive-end-points: 31
    Feb 20 20:00:43 watch-nobots: 31
    Feb 20 21:12:03 watch-expensive-end-points: 31

So if I want to reduce the number of IP addresses getting banned, I need to handle the "no bots" issue some other way. I could have just served a regular 410 error document, but I felt like trying another experiment.
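For readers who want to picture the mechanism, here is a rough Python sketch of a two-word Markov generator with a 2% link chance. It is not the actual garbage program, which isn't shown here. The function names `build_chain` and `generate` and the `/garbage/<number>` link path are made up for illustration, and the short sample sentence stands in for the full text of Moby Dick.

```python
import html
import random
from collections import defaultdict


def build_chain(text):
    """Map each pair of consecutive words to the list of words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for a, b, c in zip(words, words[1:], words[2:]):
        chain[(a, b)].append(c)
    return dict(chain)


def generate(chain, n_words, rng, link_chance=0.02):
    """Emit n_words words; each word has link_chance of being wrapped in a link."""
    keys = list(chain)
    a, b = rng.choice(keys)
    out = []
    for _ in range(n_words):
        followers = chain.get((a, b))
        if not followers:
            # Dead end (the pair only occurs at the very end of the corpus):
            # jump to a random known pair and continue from there.
            a, b = rng.choice(keys)
            followers = chain[(a, b)]
        word = rng.choice(followers)
        safe = html.escape(word)
        if rng.random() < link_chance:
            # Hypothetical URL scheme: any number leads to another generated page.
            out.append(f'<a href="/garbage/{rng.randrange(10**9)}">{safe}</a>')
        else:
            out.append(safe)
        a, b = b, word
    return " ".join(out)


if __name__ == "__main__":
    # Stand-in corpus; the real thing would read a whole book from disk.
    sample = ("call me ishmael some years ago never mind how long precisely "
              "call me ishmael again")
    print(generate(build_chain(sample), 40, random.Random()))
```

The expensive part is building the table once from the book. After that, each generated word costs a dictionary lookup and a random choice, so a page of a few hundred words should be cheap to produce.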
Does a Markov text generator eat a significant amount of resources? Let's see! #Butlerian_Jihad