
The code I gave you gives you the first match, without having to process the entire string by splitting it into separate lines first.

Re^4: Improve foreach with help of map?
by mickep76 (Beadle) on Oct 09, 2009 at 14:34 UTC

    You are indeed correct, my bad. In this case, though, it has the problem of making the regex very complex, since the first match only identifies the position and the second regexp then gets the actual value. Between the first and second match there is A LOT of crap.

    In this case it looks something like this:

    <td><b>End Date</b></td>^M
    <td><img src="http://welcome.hp-ww.com/img/s.gif" width="1" height="1" border="0" alt=""></td>^M
    <td>^M
    19 Jun 2012^M

    So the regexp would first have to match "End Date" and then the actual date and ignore everything between.

    I tried something like:

    my ($result) = $content =~ /End Date.*(\d\d \w\w\w \d\d\d\d)/;

    But it seems there's just too much crap between the lines.

    The fine example of HTML is from HP. It feels like 1994 all over again.
      For completeness, the long version not using a regex. :-)
      #!/usr/bin/perl
      use warnings;
      use strict;
      use HTML::TreeBuilder;

      my $html = do{local $/;<DATA>};
      my $t = HTML::TreeBuilder->new_from_content($html);
      my $start;
      for my $td ($t->look_down(_tag => q{td})){
          $start++, next if $td->as_text eq q{End Date};
          next unless $start;
          next if $td->look_down(_tag => q{img});
          print $td->as_text;
          last;
      }
      __DATA__
      <td><b>End Date</b></td>
      <td><img src="http://welcome.hp-ww.com/img/s.gif" width="1" height="1" border="0" alt=""></td>
      <td>
      19 Jun 2012
      I just gave basically the same solution. The difference is that mine works ;) You'll need two fixes to make yours work:
      • Use the /s modifier. /./ needs to match newline.
      • Change /.*/ to /.*?/. You want the next date, not the last date.
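As a rough sketch of what those two fixes do to the single-regex attempt: the input below is modelled on the HTML quoted earlier in the thread, with a second date added purely as an assumption so the greedy and non-greedy behaviour can be told apart. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Sample input modelled on the HTML quoted in the thread. The trailing
# "01 Jan 2015" cell is made up, only there to show greedy vs. non-greedy.
my $content = join "\r\n",
    '<td><b>End Date</b></td>',
    '<td><img src="s.gif" width="1" height="1" border="0" alt=""></td>',
    '<td>',
    '19 Jun 2012',
    '</td>',
    '<td>01 Jan 2015</td>';

# Original attempt: without /s, "." cannot cross the newlines.
my ($original) = $content =~ /End Date.*(\d\d \w\w\w \d\d\d\d)/;

# Both fixes: /s lets "." match newlines, ".*?" stops at the first date.
my ($fixed) = $content =~ /End Date.*?(\d\d \w\w\w \d\d\d\d)/s;

# Only /s, still greedy: ".*" runs on to the last date in the string.
my ($greedy) = $content =~ /End Date.*(\d\d \w\w\w \d\d\d\d)/s;

print defined $original ? "original: $original\n" : "original: no match\n";
print "fixed:    $fixed\n";
print "greedy:   $greedy\n";
```

On this input the original pattern finds nothing, the fixed one returns 19 Jun 2012, and the greedy /s variant returns the made-up 01 Jan 2015.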
      Use this:
      $content =~ /End Date/g and $content =~ /(\d\d \w\w\w \d\d\d\d)/g and my $result = $1;
        Nice, although the second /g should be omitted.
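Whether the second /g can go probably depends on the input. In scalar context a /g match resumes at pos(), while a match without /g starts again from the beginning of the string. A small sketch to check this, using a made-up string with a date before "End Date" (an assumption, since the page quoted in the thread shows only one date):

```perl
use strict;
use warnings;

# Made-up input: one date before "End Date", one after it.
my $content = "Start Date 01 Jan 2010\nEnd Date\n19 Jun 2012\n";
my $date_re = qr/(\d\d \w\w\w \d\d\d\d)/;

# Both matches use /g: the second one resumes at pos($content),
# i.e. just after "End Date".
my $with_g;
if ($content =~ /End Date/g and $content =~ /$date_re/g) {
    $with_g = $1;
}

pos($content) = undef;    # reset the match position before the second run

# Second match without /g: it ignores pos() and searches from offset 0.
my $without_g;
if ($content =~ /End Date/g and $content =~ /$date_re/) {
    $without_g = $1;
}

print "with /g:    $with_g\n";
print "without /g: $without_g\n";
```

With this string the /g version picks up 19 Jun 2012 and the version without it picks up 01 Jan 2010. On input with a single date, like the snippet posted above, both give the same answer.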