in reply to Re: Need to speed up many regex substitutions and somehow make them a here-doc list

> This approach assumes each text file can be slurped to memory; 2-100 MB should be no problem

The OP could slice the input into big chunks separated at newline boundaries.
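
To illustrate the chunking idea, here is a minimal sketch. The helper name `process_in_chunks`, the chunk size, and the callback interface are my own assumptions and not code from the thread. It reads a fixed-size block and then tops it up with the rest of the current line, so every chunk handed to the callback ends at a newline (except possibly the last one). It assumes the default `$/` and that no replacement needs to match across a line break.

```perl
use strict;
use warnings;

# Hypothetical helper: feed a filehandle to $process in chunks of roughly
# $chunk_size bytes, each one cut at a newline boundary.
sub process_in_chunks {
    my ($fh, $chunk_size, $process) = @_;
    my $out = '';
    while (read($fh, my $chunk, $chunk_size)) {
        # Extend the chunk up to and including the next newline,
        # so that no line is split between two chunks.
        my $rest = <$fh>;
        $chunk .= $rest if defined $rest;
        $out .= $process->($chunk);
    }
    return $out;
}

# Usage with an in-memory filehandle; uc stands in for the real substitutions.
my $demo_text = join '', map "line $_\n", 1 .. 5;
open my $demo_fh, '<', \$demo_text or die "Cannot open in-memory file: $!";
print process_in_chunks($demo_fh, 8, sub { uc $_[0] });
close $demo_fh;
```

In real use the callback would run the big `s///g` over its chunk and the output would go straight to a file instead of being accumulated in `$out`.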

If that's not possible, he could alternatively use a sliding window that always continues at the pos where the last replacement ended.

On a side note, your map qr{...} join ... irritated me a bit, because the processed list has only one element. Not sure if that's the clearest style.

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^3: Need to speed up many regex substitutions and somehow make them a here-doc list
by AnomalousMonk (Archbishop) on Oct 02, 2022 at 20:02 UTC
    ... your map qr{...} join ... irritated me a bit, because the processed list has only one element.

    Yeah, that gets to me a bit too, whenever I use it. But that syntax is used in haukex's original article, so I'm willing to consider it an "idiom." :)

    The important point is that the regex elements be somehow converted into a regex object. It's at this stage that any necessary boundary assertions are added. The only reasonable alternative I can see is something like

    my $rx_search = join ' | ', map quotemeta, reverse sort keys %replace;
    $rx_search = qr{ ... $rx_search ... }xms;
    That's slightly more irritating to me and doesn't seem to clarify anything either.
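
For comparison, here is a small runnable sketch of both spellings side by side. The `%replace` table, the `\b` assertions, and the `substitute_all` helper are assumptions made up for illustration; the thread leaves the actual assertions elided. Both forms should yield equivalent regex objects, so the choice is purely one of style.

```perl
use strict;
use warnings;

# Hypothetical replacement table, for illustration only.
my %replace = (cat => 'dog', catalog => 'index', 'a.b' => 'AB');

# reverse sort puts "catalog" before its prefix "cat"; quotemeta keeps
# the "." in "a.b" literal. The spaces around "|" are ignored under /x.
my $alternation = join ' | ', map quotemeta, reverse sort keys %replace;

# One-element map idiom: $_ is the joined alternation.
my ($rx_map) = map qr{ \b (?: $_ ) \b }xms, $alternation;

# Two-step version with a separate variable.
my $rx_two_step = qr{ \b (?: $alternation ) \b }xms;

# Apply every replacement in a single pass over the text.
sub substitute_all {
    my ($rx, $text) = @_;
    $text =~ s{($rx)}{$replace{$1}}g;
    return $text;
}

print substitute_all($rx_map,      "cat catalog a.b axb concat"), "\n";
print substitute_all($rx_two_step, "cat catalog a.b axb concat"), "\n";
```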


    Give a man a fish:  <%-{-{-{-<

      > $rx_search = qr{ ... $rx_search ... }xms;

      OK, it's somehow "wasting" a variable, but

      my $rx_search = qr{$joined_search}xms;

      wouldn't really irritate me.

      Cheers Rolf
      (addicted to the Perl Programming Language :)