in reply to Matching text between tags

Your initial string suffers from interpolation: inside double quotes, $1 is treated as a variable rather than as literal text. Changing to single quotes, using the /s modifier and testing gives us the desired results:

#!/usr/bin/perl
use strict;
use warnings;
use Test::More;

my $lines = 'Aliquam vitae ipsum id felis finibus congue. Ut molestie scelerisque purus, sit amet rhoncus leo aliquet ac.
In eu lobortis quam. Maecenas auctor semper enim, ut convallis sapien dictum eu.
Sed arcu ex, ornare et porttitor vitae, interdum a mi.
Mauris rutrum luctus rhoncus. Quisque velit quam, convallis vel est at, tincidunt accumsan velit.
Fusce ut <u>metus ut which may either exceed $1,000.00 or OK. G. LAT, semper nunc, in dictum magna.</u>
Aliquam ac vestibulum dolor. Praesent in magna nisi. Cras nec viverra ligula.
Suspendisse efficitur imperdiet eros, <u>sed rhoncus sapien euismod cursus. Vestibulum a posuere</u>
elit, eget tristique eros. Etiam et lectus venenatis, aliquet dui vitae, posuere lectus.';

my ($first, $second) = ($lines =~ /<u>(.*?)<\/u>/sg);

is ($first, 'metus ut which may either exceed $1,000.00 or OK. G. LAT, semper nunc, in dictum magna.', 'First match');
is ($second, 'sed rhoncus sapien euismod cursus. Vestibulum a posuere', 'Second match');
done_testing ();

$ perl 1160410.pl
ok 1 - First match
ok 2 - Second match
1..2
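
To see the interpolation problem in isolation, here is a minimal sketch. The variable names and the shortened sample text are illustrative, not taken from the original post. It compares the same text in single quotes, in double quotes, and in double quotes with the dollar sign escaped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Single quotes: no interpolation, so the dollar sign stays literal.
my $single = 'may exceed $1,000.00';

# Double quotes: $1 is interpolated as the first capture group of the last
# successful match. No match has run yet, so it is undef and disappears
# (normally with an "uninitialized" warning, silenced here for the demo).
my $double = do { no warnings 'uninitialized'; "may exceed $1,000.00" };

# Double quotes with the dollar sign escaped give the same text as single quotes.
my $escaped = "may exceed \$1,000.00";

print "single:  $single\n";
print "double:  $double\n";
print "escaped: $escaped\n";
```

Only the unescaped double-quoted version loses the "$1", which is why switching the quoting style matters before the regex is even applied.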

If that doesn't fix your problem you'll need to be a lot more specific about how it fails for you.

Re^2: Matching text between tags
by Anonymous Monk on Apr 14, 2016 at 16:26 UTC
    Why can't this code get the second match?
    #!/usr/bin/perl
    use strict;
    use warnings;

    my @lines = ("Aliquam vitae ipsum id felis finibus congue. Ut molestie scelerisque purus, sit amet rhoncus leo aliquet ac.
    In eu lobortis quam. Maecenas auctor semper enim, ut convallis sapien dictum eu.
    Sed arcu ex, ornare et porttitor vitae, interdum a mi.
    Mauris rutrum luctus rhoncus. Quisque velit quam, convallis vel est at, tincidunt accumsan velit.
    Fusce ut <u>metus ut which may either exceed \$1,000.00 or OK. G. LAT, semper nunc, in dictum magna.</u>
    Aliquam ac vestibulum dolor. Praesent in magna nisi. Cras nec viverra ligula.
    Suspendisse efficitur imperdiet eros, <u>XXsed rhoncus sapien euismod cursus. Vestibulum a posuereYY</u>
    elit, eget tristique eros. Etiam et lectus venenatis, aliquet dui vitae, posuere lectus.");

    #while (defined( my $lines = shift @lines)){
    #foreach my $lines (@lines){
    for my $lines (@lines){
        if( $lines =~ /<u>(.*?)<\/u>/sg ){
            print "\n $1\n";
        }
    }
      Why can't this code get the second match ...

      Because the /g modifier in scalar context (which is what you get when the match is evaluated as the condition of an if or while) makes an m//g match find only one match per evaluation, resuming from where the previous one ended. An if block executes at most once, so it only ever sees the first match. A while loop re-evaluates its condition until it is no longer true, so it sees every match.

      c:\@Work\Perl\monks>perl -wMstrict -le
      "my $s = qq{foo <u> match \n the first </u> bar <u> second \n match </u> baz};
       print qq{[[$s]] \n};
       ;;
       if ($s =~ m{ <u> (.*?) </u> }xmsg) {
         print qq{if: '$1'};
         }
       print '';
       ;;
       pos $s = 0;
       while ($s =~ m{ <u> (.*?) </u> }xmsg) {
         print qq{while: '$1'};
         }
      "
      [[foo <u> match
       the first </u> bar <u> second
       match </u> baz]]

      if: ' match
       the first '

      while: ' match
       the first '
      while: ' second
       match '

      Note: The pos $s = 0; statement is needed in the example because each string keeps track of its own match position, and that match position is used in /g matching. Try eliminating the statement from the code and see what happens. Also try printing pos at various strategic points in execution.
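
The match-position bookkeeping can be traced with a short script. This is a minimal sketch with a made-up sample string and variable names of my own. It records pos before any match, after one successful scalar-context /g match, and after a /g match finally fails.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $s = "foo <u>one</u> bar <u>two</u> baz";
my @trace;

# Before any /g match, the string has no stored position.
push @trace, defined pos($s) ? pos($s) : 'undef';

# One successful scalar-context /g match leaves pos just past that match.
if ($s =~ m{<u>(.*?)</u>}sg) {
    push @trace, pos($s);
}

# pos was not reset, so this loop resumes after the first match
# and only ever sees the second one.
my @found;
while ($s =~ m{<u>(.*?)</u>}sg) {
    push @found, $1;
}

# The failed match that ended the loop reset pos to undef.
push @trace, defined pos($s) ? pos($s) : 'undef';

print "pos trace: @trace\n";
print "found without reset: @found\n";
```

Adding pos($s) = 0; before the while loop would make it start from the beginning of the string again and find both matches.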

      Update: Also look in Regexp Quote-Like Operators in perlop for discussion of m/PATTERN/msixpodualgc and look for the phrase 'In scalar context, each execution of "m//g" ...' (Update: See also Global matching in perlretut.)
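
Applied to text that arrives as an array, the same idea might look like the sketch below. The sample data and variable names are invented for illustration. It shows two ways to collect every tagged span from every element: a while loop in scalar context, and a list-context match that returns all captures at once.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: two array elements, three tagged spans in total.
my @lines = (
    "Fusce ut <u>first\nmatch</u> Aliquam <u>second match</u> elit.",
    "Etiam <u>third match</u> lectus.",
);

# Option 1: scalar-context /g inside while, one match per iteration.
my @with_while;
for my $line (@lines) {
    while ($line =~ m{<u>(.*?)</u>}sg) {
        push @with_while, $1;
    }
}

# Option 2: list-context /g returns every capture from the string at once.
my @with_list;
for my $line (@lines) {
    my @matches = $line =~ m{<u>(.*?)</u>}sg;
    push @with_list, @matches;
}

print "while: [$_]\n" for @with_while;
print "list:  [$_]\n" for @with_list;
```

Both loops collect the same three spans. The list-context form is shorter when only the captured text is needed, while the while form lets you do extra work per match.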



        I didn't, because the text comes in array format and I was getting confused by it.