http://qs1969.pair.com?node_id=718936

brycen has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, Is there a nice clean way to do a replace after match, sort of like this nice clean but non-working code:
#!/usr/bin/perl -w use strict; my %fixers=( 'amazon.com' => 'danube.com', 'ibm.com' => 'bm.com', ); my $text = qq( Blah <a href="http://amazon.com/foo">one</a> <a href="http://ebay.com/foo">two</a> ); sub tweak_links($) { my $text_ref = shift; while( $$text_ref =~ m|"http://(.*?)["/]|g ) { if( $fixers{$1} ) { $1 = $main::fixers{$1}; } } } tweak_links(\$text);

I'm looking for something that's good when most of the matches never happen, but I can zero in on the right section with a match. In my case I even know which of the replacements is most likely on a particular text block (but all of them could happen in each block.

Is there a usual way this is done?

Keywords: perl regex replacement regular expression table-driven match multiple replacements

Replies are listed 'Best First'.
Re: Replace after match in regex (key value subsitution)
by lamp (Chaplain) on Oct 23, 2008 at 04:08 UTC
    The following eg. code will substitute the '%fixers' hash key with corresponding value from '$text'.
    #!/usr/bin/perl -w use strict; my %fixers=( 'amazon.com' => 'danube.com', 'ibm.com' => 'bm.com', ); my $text = qq( Blah <a href="http://amazon.com/foo">one</a> <a href="http://ebay.com/foo">two</a> ); map { $text =~ s/$_/$fixers{$_}/; }keys %fixers; print $text;
      Almost.
      • You forgot to convert the text to a regexp pattern.
      • You only replace the first instance.
      • And since I'm already changing the line, I'll remove the useless use of map.
      $text =~ s/\Q$_\E/$fixers{$_}/g for keys %fixers;

      It could still be improved if it's going to be done repeatedly.

      my ($re) = map qr/$_/, join '|', map quotemeta, keys %fixers; while (...) { ... $text =~ s/$re/$fixers{$_}/g for keys %fixers; ... }

      Or even better for pre-5.10

      use List::Regexp qw( ); my $re = List::Regexp->new()->list2re( keys %fixers ); while (...) { ... $text =~ s/$re/$fixers{$_}/g for keys %fixers; ... }
Re: Replace after match in regex (key value subsitution)
by gone2015 (Deacon) on Oct 23, 2008 at 09:51 UTC

    How about:

    $$text_ref =~ s|(?<="http://)(.*?)(?=["/])|$fixers{$1} ? $fixers{$1} + : $1|ge ;

    which does one scan of the input. This may be an advantage if you have a lot of fixers. But, if there are a lot of things that match, but don't require fixing, doesn't work so well.

    the more complicated:

    my $what = join('|', map(quotemeta, keys %fixers)) ; $$text_ref =~ s!(?<="http://)($what)(?=["/])!$fixers{$1}!ge ;
    also does just the one scan and hits only the fixers.

Re: Replace after match in regex (key value subsitution)
by JadeNB (Chaplain) on Oct 23, 2008 at 18:47 UTC
    Although others have pointed out that there is a nice one-pass approach to your problem, it is possible to access the location of a match after the match itself:
    my $a = 'abc'; $a =~ /(b)/; substr($a, $-[1], $+[1] - $-[1], 'B'); print $a, "\n"; # aBc
    See perlvar.
Re: Replace after match in regex (key value subsitution)
by JavaFan (Canon) on Oct 23, 2008 at 09:53 UTC
    I'd write that as (untested):
    $$text_ref =~ s{"http://\K([^/"]*)(?=["/]){$main::fixers{$1} // $1}eg;
    I assume non of the values of %main::fixers is undefined.
Re: Replace after match in regex (key value subsitution)
by brycen (Monk) on Oct 24, 2008 at 23:14 UTC
    Ok, I warmed up to this solution, thanks oshalla!
    # pre build a pattern "foo\.edu|fum\.edu" my $what = join('|', map(quotemeta, keys %main::fixers)) ; sub tweak_links($) { my $text_ref = shift; return($$text_ref =~ s!(?<="http://)($what)(?=["/])!$main::fixers{$1}!ge); }
    The overall script runtime dropped in half (0m4.8s vs. 0m7.2s on a test directory).
Re: Replace after match in regex (key value subsitution)
by brycen (Monk) on Oct 24, 2008 at 22:47 UTC
    I ended up going with this, because, well, clarity of code won over speed?
    sub tweak_links($) { my $text_ref = shift; my $modified = 0; while (my($key,$value) = each(%main::fixers)) { if($$text_ref =~ s|"(http://)$key(["/])|"$1$value$2|g) { #" $modified = 1; } } return($modified); }
    I'd prefer a pre-compiled regex, if the resulting code was not obtuse.