
Why can't this code get the second match?
#!/usr/bin/perl
use strict;
use warnings;

my @lines = ("Aliquam vitae ipsum id felis finibus congue. Ut molestie scelerisque purus, sit amet rhoncus leo aliquet ac.
In eu lobortis quam. Maecenas auctor semper enim, ut convallis sapien dictum eu.
Sed arcu ex, ornare et porttitor vitae, interdum a mi. Mauris rutrum luctus rhoncus.
Quisque velit quam, convallis vel est at, tincidunt accumsan velit. Fusce ut
<u>metus ut which may either exceed \$1,000.00 or OK. G. LAT, semper nunc, in dictum magna.</u>
Aliquam ac vestibulum dolor. Praesent in magna nisi. Cras nec viverra ligula.
Suspendisse efficitur imperdiet eros, <u>XXsed rhoncus sapien euismod cursus. Vestibulum a posuereYY</u>
elit, eget tristique eros. Etiam et lectus venenatis, aliquet dui vitae, posuere lectus.");

#while (defined( my $lines = shift @lines)){
#foreach my $lines (@lines){
for my $lines (@lines){
    if( $lines =~ /<u>(.*?)<\/u>/sg ){
        print "\n $1\n";
    }
}

Re^3: Matching text between tags
by AnomalousMonk (Archbishop) on Apr 14, 2016 at 16:58 UTC
    Why this code can't get the second match ...

    Because in scalar context (which an if or while condition supplies), an m//g match finds only one match per evaluation, starting from where the previous successful match on that string ended. An if evaluates its condition once, so its block runs at most once. A while re-evaluates its condition after each pass, so its block keeps running until the match fails.
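Applied to the loop from the question, a minimal sketch of the fix is to swap the if for a while. The shortened @lines data and the @found array below are stand-ins invented for illustration, not the original text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the question's data:
# a single array element holding two <u>...</u> spans.
my @lines = ("Fusce ut <u>first\nspan</u> Aliquam ac <u>second span</u> elit.");

my @found;    # collected only so the result can be inspected afterwards

for my $line (@lines) {
    # 'while' re-evaluates the match; each pass resumes where the
    # previous successful match on $line ended.
    while ( $line =~ /<u>(.*?)<\/u>/sg ) {
        push @found, $1;
        print "\n $1\n";
    }
}
```

The array itself is not the obstacle here: the for loop still hands over one string at a time, and the while then walks through every match inside that string.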

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $s = qq{foo <u> match \n the first </u> bar <u> second \n match </u> baz};
     print qq{[[$s]] \n};
     ;;
     if ($s =~ m{ <u> (.*?) </u> }xmsg) {
       print qq{if: '$1'};
       }
     print '';
     ;;
     pos $s = 0;
     while ($s =~ m{ <u> (.*?) </u> }xmsg) {
       print qq{while: '$1'};
       }
    "
    [[foo <u> match
     the first </u> bar <u> second
     match </u> baz]]

    if: ' match
     the first '

    while: ' match
     the first '
    while: ' second
     match '

    Note: The pos $s = 0; statement is needed in the example because each string keeps track of its own match position, and /g matching starts from that position. Without the reset, the while loop would resume just past the match the if had already found. Try removing the statement from the code and see what happens. Also try printing pos at various points during execution.
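One way to run that experiment is sketched below. The string $s and the arrays @trace and @seen are made up for this illustration and are not part of the original post. The sketch deliberately leaves out the reset, so the while loop picks up from the position the if left behind.

```perl
use strict;
use warnings;

my $s = "a <u>one</u> b <u>two</u> c";   # illustrative string
my @trace;                                # pos recorded after each match
my @seen;                                 # captures seen by the while loop

# First scalar-context match: pos is left just past the first </u>.
if ( $s =~ m{<u>(.*?)</u>}sg ) {
    push @trace, pos($s);
    print "if: '$1', pos = ", pos($s), "\n";
}

# No 'pos $s = 0;' here, so this loop resumes from the stored
# position and only finds the second span.
while ( $s =~ m{<u>(.*?)</u>}sg ) {
    push @seen, $1;
    push @trace, pos($s);
    print "while: '$1', pos = ", pos($s), "\n";
}

# The failed match that ended the loop reset pos to undef.
print "after loop: pos is ",
    ( defined( pos($s) ) ? pos($s) : 'undef' ), "\n";
```

Adding pos($s) = 0; between the if and the while should make the loop report both spans again.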

    Update: Also look in Regexp Quote-Like Operators in perlop for discussion of  m/PATTERN/msixpodualgc and look for the phrase 'In scalar context, each execution of "m//g" ...'     (Update: See also Global matching in perlretut.)
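The same perlop section also covers list context, where m//g returns the captures of every match at once, so no loop or pos bookkeeping is needed. A minimal sketch, reusing the test string from the example above; the @matches array is a name chosen here for illustration.

```perl
use strict;
use warnings;

my $s = "foo <u> match \n the first </u> bar <u> second \n match </u> baz";

# In list context, m//g returns every capture from every match.
my @matches = $s =~ m{ <u> (.*?) </u> }xmsg;

print "found ", scalar(@matches), " matches\n";
print "[$_]\n" for @matches;
```

This form is convenient when all matches are wanted up front; the while form suits cases where each match is processed as it is found.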


    Give a man a fish:  <%-{-{-{-<

Re^3: Matching text between tags
by LanX (Saint) on Apr 14, 2016 at 16:55 UTC
      I didn't, because the text comes in array format and I got confused by it.