2025-07-25 Oddmuse and the Butlerian Jihad
==========================================

So what about that old wiki software that's been serving me so well for more than twenty years? Back then, Recent Changes or a search for a page title was implemented as a GET request. The idea was that it would be possible to bookmark or share such links. But what has happened instead is that the web scrapers are losing themselves in a gazillion dynamic pages, trying to ingest them. And since searches and filters are expensive operations, this drives up the load of the system hosting the wiki.

Examples such as these showed up in previous blog posts:

    # 2h-access-log !^social | asn-access-log 7713 | log-request | rank-lines
    1 /wiki/Older_Upgrading_Issues
    1 /wiki/CategoryWiki
    1 /wiki?action=rss&all=1&days=1&full=1&rcidonly=wiki_feeds&showedit=0
    1 /wiki?action=rss&all=0&days=7&diff=1&full=1&rcidonly=CommentHabillerUnFilRss&showedit=1
    1 /wiki?action=rss&all=0&days=28&rcidonly=2004-07-12&showedit=0
    1 /wiki?action=rc&from=1749992400&rcidonly=GermanXpCommunity&showedit=1&upto=1750597200
    1 /wiki?action=rc&all=1&from=1750742594&rcidonly=WikiToHTML&upto=1751001794
    1 /wiki?action=rc&all=0&days=28&rcfilteronly=%22DifficultPerson%22&showedit=0
    1 /wiki?action=rc&all=0&days=14&rcidonly=HoofSmith&rollback=1&showedit=0
    1 /wiki?action=admin&id=UserInterfaceValidator

In theory, POST is used when making changes to a web resource such as a wiki page, and scrapers following links only send GET requests. What I've done now is write an extension that changes all these links to forms, and I've installed it for the Oddmuse wikis I still run. At the same time, the web server is blocking requests to these URLs.
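
To illustrate the link-to-form idea: the following is a minimal sketch in Python, not the actual Oddmuse extension. The helper name `link_to_form` and the exact markup are assumptions made up for this example. It takes a URL with query parameters and renders a POST form with one hidden field per parameter, so there is no longer a link for a scraper to follow.

```python
from html import escape
from urllib.parse import urlsplit, parse_qsl


def link_to_form(url: str, label: str) -> str:
    """Render a link with query parameters as a POST form (hypothetical helper)."""
    parts = urlsplit(url)
    # Every query parameter becomes a hidden field, so the request
    # carries the same data in the body instead of in the URL.
    fields = "".join(
        f'<input type="hidden" name="{escape(name)}" value="{escape(value)}">'
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    )
    return (
        f'<form method="post" action="{escape(parts.path)}">'
        f"{fields}<button>{escape(label)}</button></form>"
    )


# One of the Recent Changes URLs from the log above, shortened.
print(link_to_form("/wiki?action=rc&all=0&days=28&showedit=0", "Last 28 days"))
```

The price is the one the original design wanted to avoid: a form like this can no longer be bookmarked or shared as a link. And it only helps together with the server-side block, since the old GET URLs keep being requested.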

    # Block for GET requests for search, recent changes and filtered feeds
    RewriteCond "%{QUERY_STRING}" "search=" [or]
    RewriteCond "%{QUERY_STRING}" "action=rc" [or]
    RewriteCond "%{QUERY_STRING}" "action=rss[&;]"
    RewriteRule "^" https://alexschroeder.ch/nobots [redirect=410,last]

Let's see if that starts cutting down on the number of these requests I'm getting. I suspect that many of these URLs are in fact stored in training sets so it will take a long time for these URLs to fade from use.

#Butlerian_Jihad #Oddmuse #Apache