Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm using a while loop to iterate over some text to find all the instances of a pattern match. When I do the substitution on the match, the process hangs (somehow I've created an infinite loop?). I thought perhaps the "[]" characters in one of the variables were causing problems on interpolation, but escaping them didn't help. Is there a better way to substitute text on multiple matches when you need match-specific substitutions?

while ($str =~ /<tag1[^>]*>\s*<tag2>\s*<tag3>([^<]*)<\/endtag3>/gsi) {
    ++$a;
    $b = $1;
    $b = $b . " [ $c-$a]";
    $str =~ s/(<tag1[^>]*>\s*<tag2>\s*)<tag3>[^<]*<\/endtag3>/$1<tag3>$b<\/endtag3>/si;
}

Replies are listed 'Best First'.
Re: while loop w/match is hanging
by Eliya (Vicar) on Jan 13, 2012 at 19:55 UTC

    Don't do that. It doesn't work, because the substitution resets the internal position pointer used with //g, so the match always restarts from the beginning. (Even if it didn't, it likely still wouldn't work: modifying the string can leave the remembered start position for the next match pointing at the wrong place. You could in theory adjust it by assigning to pos($str), but why make things more complex than they need to be?)
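
    The position reset described above can be seen in isolation with a small sketch. The sample string is made up for illustration and is not the poster's data.

```perl
use strict;
use warnings;

# Hypothetical sample string, chosen only to make the positions easy to read.
my $str = "aXbXc";

# A successful scalar-context //g match records where it stopped.
$str =~ /X/g or die "no match";
my $pos_after_match = pos($str);    # just past the first "X"

# Modifying the string (here via s///) clears the recorded position,
# so the next //g match would start again from offset 0.
$str =~ s/X/YY/;
my $pos_after_subst = pos($str);

print "after match: $pos_after_match\n";
print "after subst: ", (defined $pos_after_subst ? $pos_after_subst : "undef"), "\n";
```

    In the original loop, every pass therefore finds the first tag group again and appends another suffix to it, which is why the loop never ends.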

    Simply do a global substitution without prior matching, and put the computations in a separate sub. Something like this:

    sub subst_str {
        my $match = shift;
        my $b = ...
        return "$match<tag3>$b</endtag3>";
    }
    ..
    $str =~ s/(<tag1[^>]*>\s*<tag2>\s*)<tag3>[^<]*<\/endtag3>/subst_str($1)/gsie; # note the /e
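
    For anyone who wants to run it, here is one way the outline above might be filled in. This is a sketch under assumptions: the sample input, the value of $c and the $count counter are invented, and a second capture group is added so the sub can see the existing <tag3> text. The elided computation in the original is not known, so the suffix simply mirrors what the question's loop builds.

```perl
use strict;
use warnings;

# Hypothetical label and sample data; neither appears in the original post.
my $c   = 'ref';
my $str = '<tag1 id="x"><tag2><tag3>foo</endtag3> <tag1><tag2><tag3>bar</endtag3>';

my $count = 0;

# Build the replacement for one match: the untouched prefix, then the
# original <tag3> text with a numbered suffix appended.
sub subst_str {
    my ($prefix, $text) = @_;
    $count++;
    return sprintf "%s<tag3>%s [ %s-%d]</endtag3>", $prefix, $text, $c, $count;
}

# /g visits every match in a single pass; /e evaluates the replacement as code.
$str =~ s/(<tag1[^>]*>\s*<tag2>\s*)<tag3>([^<]*)<\/endtag3>/subst_str($1, $2)/gsie;

print "$str\n";
```

    Because s///g walks the string once on its own, there is no outer while loop and no position pointer to go stale.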
Re: while loop w/match is hanging
by kennethk (Abbot) on Jan 13, 2012 at 20:05 UTC
    Thank you for posting code, but without input, replicating your problem is more of a challenge than it should be. See How do I post a question effectively?.

    The problem is that your iterator gets reset every time you change your string, so you end up accumulating an infinite number of your modifications in your first match. A much cleaner way of accomplishing this would be to use the /e modifier -- see Search and replace in perlretut. You can then get it all done in one clean shot:

    my $counter = 0;
    $str =~ s/<tag1[^>]*>\s*<tag2>\s*<tag3>[^<]*\K(?=<\/endtag3>)/sprintf " [ $c-%d]", ++$counter/egsi;

    I've used the \K escape (Character Classes and other Special Escapes) and a look-ahead (Looking ahead and looking behind) to avoid having to put back material that was removed unnecessarily. It also makes the result much clearer, I think.
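
    A self-contained version of the same idea, as a sketch: the sample input and the value of $c are invented for illustration. It needs Perl 5.10 or later for \K, and it passes $c to sprintf as an argument rather than interpolating it into the format, a small variation that avoids surprises should $c ever contain a % character.

```perl
use strict;
use warnings;
use 5.010;    # \K needs Perl 5.10 or later

# Hypothetical label and sample data; neither appears in the original post.
my $c   = 'ref';
my $str = '<tag1><tag2><tag3>foo</endtag3><tag1><tag2> <tag3>bar</endtag3>';

my $counter = 0;

# Everything before \K must match but is kept as-is; the look-ahead checks
# for the closing tag without consuming it. The replacement is therefore
# inserted between the <tag3> text and </endtag3>.
$str =~ s/<tag1[^>]*>\s*<tag2>\s*<tag3>[^<]*\K(?=<\/endtag3>)/sprintf(" [ %s-%d]", $c, ++$counter)/egsi;

print "$str\n";
```

    Since nothing is actually removed from the string, the replacement only has to produce the new suffix.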

    As a side note, you should not use $a and $b as normal variables, since they have special meaning in a sorting context; see $a in perlvar. It's usually considered poor form to use single-character variable names, except in some very specific, common scenarios, e.g. $i, $j... for counters and $x, $y... for coordinates.

Re: while loop w/match is hanging
by CountZero (Bishop) on Jan 13, 2012 at 19:51 UTC
    If you show some of the data that exhibit this behaviour we can test it.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James