in reply to Re: Perlmonks site has become far too slow
in thread Perlmonks site has become far too slow

Yes, it has been particularly bad this week and, as has been discussed in this and other similar threads, it is driving genuine users away. If there were an easy fix it would have been applied.

If you want to protect your favourite websites, the solution is not to use any LLM-trained products or services whatsoever and to encourage everyone you know to take the same stance. We all need to cut off the revenue streams of these pariahs, since revenue is the only thing which matters to them.


🦛


Re^3: Perlmonks site has become far too slow
by syphilis (Archbishop) on Aug 29, 2025 at 13:58 UTC
    If there were an easy fix it would have been applied.

    The thing that puzzles me is that, of all the websites that I regularly peruse, perlmonks stands out (and I mean really stands out) as clearly the slowest and flakiest.
    Why is that? What is the "feature" of perlmonks that makes it so extremely susceptible to these attacks?

    Cheers,
    Rob

      Every page involves several database accesses and several Perl eval calls. See DBIx::VersionedSubs for something similar, though it is not used on Perlmonks itself.

      I think converting to static files for (say) SoPW nodes served to Anonymous Monk might reduce the load so that the site remains accessible for the human users. I'm thinking of measuring the impact of a -f call for every page load (see the sketch below). Of course, using static files means that the nodelets either go stale or need to be removed from Anonymous Monk's view of the site.
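
      A quick way to get a feel for that cost might be something like this (a sketch; the cache path and node id are made up for illustration):

          # Hypothetical micro-benchmark: the per-request cost of one -f file
          # test, for both an existing and a missing cache file.
          use Benchmark qw(cmpthese);
          my $hit  = '/var/cache/perlmonks/11165161.html';    # assumed to exist
          my $miss = '/var/cache/perlmonks/no-such-node.html';
          cmpthese( -2, {
              'stat hit'  => sub { my $x = -f $hit },
              'stat miss' => sub { my $x = -f $miss },
          } );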

      Caching is of no use, since the bots are basically hitting all URLs with equal randomness, so there is no set of "hot" nodes.

        I think converting to static files for (say) SoPW nodes for Anonymous Monk might reduce the load so that the site remains accessible for the human users.

        Or it will encourage even harder attacks. They definitely have the resources and will occupy the extra bandwidth and space anyway, once they discover it has been created.

        I don't understand why we don't block an IP as soon as an anonymous user hits lots of pages in quick succession, especially old ones which have not been visited for a while; a sketch of such a check follows below. We had similar discussions before (Re^3: Unable to connect), and interesting clues came out of jdporter's investigations. Now a different angle is being discussed. Are we going in circles?
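
        For illustration, a minimal sliding-window check of that kind could look as follows (a sketch; the thresholds are hypothetical, and actual blocking would better happen at the firewall or web-server level):

            # Hypothetical rate limiter: flag an IP that requests more than
            # $MAX pages within a $WINDOW-second sliding window.
            my %hits;            # ip => arrayref of recent request timestamps
            my $MAX    = 30;
            my $WINDOW = 60;

            sub too_many_requests {
                my ($ip) = @_;
                my $now  = time;
                my $list = $hits{$ip} //= [];
                @$list = grep { $now - $_ < $WINDOW } @$list;   # drop stale entries
                push @$list, $now;
                return @$list > $MAX;
            }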

        The site is so unusable that it discourages me from posting spontaneously. I don't have the luxury of saving a failed post, provided I can find it, for future posting at a time when the wicked bots take a minute of rest (bots have a cup of tea).

        An idea would be to contact the AI bots and sell the data directly to them. Can't have more static than what's inside a USB stick.

        edit: last time they tried to pacify the beast, it swallowed Czechoslovakia.

        24h edit: we could have a live site where all the interactive users go to vote/post/admin, and have the static site updated from the live site daily for anon, search and bots. The live site for logged-in users would be just as it is now, but at a secret www._perlmonks.org address. Then every night the static site is updated with the daily deltas from the live site. That would keep the monks sheltered from the evil bots, which will ravish everyone else outside the monastery, "that's a shame!". Relevant soundtrack while our sitedev clan fights the evil bots: Yoshimi Battles the Pink Robots by The Flaming Lips (youpuke warning; ot: i use the newpipe app from f-droid to view yt content on mobile). I saved this edit 12h ago when the site was unusable and am posting it now when the site seems to be very responsive.

        bw, bliako

        How about:
            RewriteEngine On

            # Match requests like /?node_id=12345
            RewriteCond %{REQUEST_URI} ^/$
            RewriteCond %{QUERY_STRING} ^node_id=([0-9]+)$

            # Skip if the cookie header contains userpass=
            RewriteCond %{HTTP_COOKIE} !(^|;\s*)userpass=

            # Serve cached file if it exists
            RewriteCond %{DOCUMENT_ROOT}/cache/%1.html -f
            RewriteRule ^$ /cache/%1.html [L]

        Then any time Anonymous Monk requests a page, save a copy of what you serve to /cache/$node_id.html, and every time someone posts/edits content under a node, call unlink("cache/$node_id.html"); both halves are sketched below.
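
        Spelled out, that save/invalidate pair might look something like this (a sketch; the helper names are made up):

            # Hypothetical cache writer and invalidator for the rewrite rules above.
            use File::Path qw(make_path);

            # Called after rendering a page for an anonymous request.
            sub cache_page {
                my ($node_id, $html) = @_;
                make_path('cache');
                open my $fh, '>', "cache/$node_id.html" or return;   # best effort
                print {$fh} $html;
                close $fh;
            }

            # Called whenever someone posts or edits content under a node.
            sub invalidate_page {
                my ($node_id) = @_;
                unlink "cache/$node_id.html";
            }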

        Apache should be able to crank these out way faster than CGI could.

        For bonus points, store the cache in ZFS with compression enabled. Maybe also minify the HTML before saving it.
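
        For the minification part, a module like HTML::Packer could be bolted onto the cache writer (a sketch; the module choice is just one option):

            # Hypothetical: minify the HTML before it is written to the cache.
            use HTML::Packer;

            sub minify_html {
                my ($html) = @_;
                HTML::Packer->init->minify( \$html, {
                    remove_comments => 1,
                    remove_newlines => 1,
                } );
                return $html;
            }

            # e.g. cache_page( $node_id, minify_html($output) );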

        Some remarks/ideas for the logs:

        > Every page is several database accesses

        getNodeById in Everything/NodeBase.pm already uses a cache when accessing the DB:

            472: sub getNodeById
            473: {
                 ...
            488:     # See if we have this node cached already
            489:     $cachedNode = $this->{cache}->getCachedNodeById($N);
            490:     return $cachedNode unless ($selectop eq 'force' or not $cachedNode);

        see also Everything/NodeCache.pm

        I suppose the gods have increased the cache-size from the initial 300 in Everything/NodeBase.pm?

        095: $db->{cache} = new Everything::NodeCache($this, 300);
        Does the caching prioritize based on access count? I have to admit this is not easy to grasp.

        > so there is no set of "hot" nodes.

        There is no set of hot posts (which are internally nodes), but specific code and HTML nodes certainly are heavily used internally (AFAICS 99.9% of the monastery is held in DB nodes).

        > and several Perl eval calls

        using memoization of the eval'ed code in Everything/HTML.pm might help here to avoid unnecessary recompilation

            968: sub evalCode {
            969:     my( $code )= shift @_;
            970:     my( $CURRENTNODE )= shift @_;
            971:     # Note! @_ is left set to remaining arguments!
                 ...
            985:     my $str = eval $code;
                 ...

        (tho there might be a side effect of pre-compiling into an extra sub layer)

        something like (untested)

            my $sub = $evalcode_cache{$code} //= eval "sub { $code }";
            my $str = $sub->(@_);
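
        In context, and keeping in mind the extra-sub-layer caveat above, the memoized variant might look like this (still a sketch, untested):

            # Hypothetical memoized evalCode: compile each distinct code node once
            # into a sub, then reuse the compiled sub on later calls.
            our %evalcode_cache;
            our $CURRENTNODE;   # package var, so cached subs see the current node

            sub evalCode {
                my $code = shift @_;
                local $CURRENTNODE = shift @_;   # visible to the eval'ed code
                # Note! @_ is left set to remaining arguments, as before.
                my $sub = $evalcode_cache{$code} //= eval "sub { $code }";
                return defined $sub ? $sub->(@_) : undef;
            }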

        HTML caching

        An internal caching of the result of std_node_display in the DB might help too, but plenty of side parameters need to be taken into consideration.

        A caching for Anomonk alone must take into consideration (at least):

        • if the sub-tree of replies has changed
        • if the content of the post or any reply has changed by update
        • if the down-votes for a reply have reached the so-called "crap-level" at which it is hidden
        A pragmatic solution would be not to list the content of all replies for Anomonk, just the links to the direct replies.
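
        Whatever the storage, the three triggers above could all funnel into one invalidation hook (a hypothetical sketch):

            # Hypothetical: a single invalidation hook for a node's cached HTML.
            my %html_cache;    # node_id => cached page (or unlink a static file instead)

            sub invalidate_cached_html {
                my ($node_id) = @_;
                delete $html_cache{$node_id};
            }

            # Sketched call sites for the three triggers:
            #  - a reply is added under $node_id    -> invalidate_cached_html($node_id)
            #  - the post or any reply is updated   -> invalidate_cached_html($node_id)
            #  - a reply crosses the "crap-level"   -> invalidate_cached_html($node_id)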

        The "print view w/o replies" is already close, but doesn't include links to children replies yet.

        compare https://perlmonks.org/?node_id=11164875;displaytype=print

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        see Wikisyntax for the Monastery

        > I'm thinking of measuring the impact of a -f call for every page load.

        I'm not sure what a -f call means ... (?)

        > I think converting to static files for (say) SoPW nodes for Anonymous Monk might reduce the load so that the site remains accessible for the human users.

        This is a brainstorm:

        From what I see, Anonymous Monk's only dynamic content for > 99% of the nodes is in the nodelets (Chatterbox, Other Users, and what else?), and those should be of low priority for AnoMonk.

        A frontend could check whether the user is logged in and whether the node-id is lower than that of the last caching run, and/or deliver a static file if present; a sketch follows below.
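
        Such a gate might look roughly like this, e.g. early in the request handling (a sketch with a made-up file layout, assuming the userpass cookie marks logged-in users, as in the rewrite rules above):

            # Hypothetical frontend gate: anonymous requests for cached nodes are
            # answered from static files; everything else falls through.
            use CGI;
            my $query     = CGI->new;
            my $logged_in = defined $query->cookie('userpass');
            my $node_id   = $query->param('node_id');

            if ( !$logged_in and defined $node_id and $node_id =~ /^[0-9]+$/
                 and -f "static/$node_id.html" ) {
                print $query->header('text/html');
                open my $fh, '<', "static/$node_id.html" or exit;
                print while <$fh>;    # stream the cached page
                exit;
            }
            # ... otherwise fall through to the normal dynamic rendering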

        If the caching to a static file is done ...

        • once per day as a bulk operation
        • or if file is missing
        • or on every write (like updates)
        ... is a matter of debate.

        However, not sure how best to deal with named nodes.

        I can't tell if the caching should best be done in the file system or in a DB.

        And if the "frontend" could be in realized in a web-server rule or a patch in Everything.pm

        (I'm sure I've not covered all edge cases, but wanted at least to have them written down for future reference :)

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        see Wikisyntax for the Monastery