
Note that this code is quite simple and will only work if only one section continues across multiple lines. If you need more than one section that handles multiple lines, the same basic idea can work, but it takes more work.
Thanks for the advice. I did think of such an approach and then discarded it for the very reason you state above. I'll look at it again and see if I can finagle something useful.

I guess the best way to state the problem is that the value of any label continues until a new label is encountered even if \n is encountered on the way. The labels are distinguished by \s*\w*\s:
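
To make that rule concrete, here is a minimal, untested-against-real-data sketch of the lookahead idea. It assumes the whole record has been joined into one string and that the set of labels is known in advance; the label list and the sub name `grab_fields` are hypothetical, inferred from the regexes posted in this thread. Using a fixed label list rather than a generic `\w*\s*:` keeps values such as a time like 21:48 from being mistaken for a label.

```perl
use strict;
use warnings;

# Hypothetical label list, inferred from the regexes posted in this thread.
my @labels   = ('Dig No', 'Prior', 'Digstrt', 'Time', 'Address', 'Subdivsn', 'Remarks');
my $label_re = join '|', map { quotemeta } @labels;

# Takes the whole record as one string and returns a label => value hash.
sub grab_fields {
    my ($text) = @_;
    my %field;

    # A value runs lazily up to the next "label :" or to the end of the text.
    # The lookahead tests for the next label without consuming it, and the
    # s modifier lets . match across newlines.
    while ($text =~ /\b($label_re)\s*:\s*(.*?)(?=\s*\b(?:$label_re)\s*:|\s*\z)/gs) {
        my ($label, $value) = ($1, $2);
        $value =~ s/\s*\n\s*:?\s*/ /g;    # fold continuation lines into one line
        $field{$label} = $value;
    }
    return %field;
}
```

Called as `my %m = grab_fields(join "\n", @m);`, this would fill one hash entry per label, with continuation lines folded into the value of the label that precedes them.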

Re^3: using lookaround assertions to grab info
by ryantate (Friar) on Jun 03, 2004 at 21:48 UTC
    Why not
    my $section = ''; # remember the last section label we encountered
    foreach (@m) {
        if (/^Dig No\s:\s(\w*)\s*Prior:\s*([0-9]*)\s*Digstrt:\s*([0-9]{2}\/[0-9]{2}\/[0-9]{2})\s*Time:\s*([0-9]{2}:[0-9]{2})/) {
            $section = 'DIGTIME';
            $m{'DIG_NO'  } = $1;
            $m{'PRIORITY'} = $2;
            $m{'DIGDATE' } = $3;
            $m{'DIGTIME' } = $4;
        }
        elsif (/^Address\s*:\s*(.*)Subdivsn/) {
            $section = 'ADDRESS';
            $m{'ADDRESS'} = $1;
        }
        elsif (/^Remarks\s*:\s*(.*)/) {
            $section = 'REMARKS';
            $m{'REMARKS'} = $1;
        }
        elsif (/^\s*:\s*(.+?)\s*$/) {
            # continuation line: append to the last section seen
            # (anchored with $ so the lazy .+? captures the whole line, not one character)
            $m{$section} .= $1;
        }
    }
Re^3: using lookaround assertions to grab info
by Ven'Tatsu (Deacon) on Jun 03, 2004 at 21:45 UTC
    Have you thought of extracting the match before the if/elsif ... in the foreach loop?
    my $section = '';
    foreach (@m) {
        if (/^\s*([\w\s]*?)\s*:/ && $1) { # if we matched and we captured a section label
            $section = $1;
        }
        if ($section eq '...') { ... }
    }
      @m is an array of lines and some lines have multiple sections, ergo this exact code would not work. Or am I missing something?
        It's very likely I could be wrong, I haven't tested it, but I think it would work.
        my $section = '';
        foreach (@m) {
            if ( /^          # match the start of the line
                  \s*        # match 0 or more whitespace characters, grabbing as many as we can
                  (          # start capturing
                  [\w\s]*?   # match 0 or more word or whitespace characters, grabbing as _few_ as possible
                  )          # end capture
                  \s*        # match 0 or more whitespace characters
                  :          # until we reach a colon
                 /x          # x added for the comments
                 &&          # if the match fails we don't bother checking $1
                 $1) {       # check if we captured anything; if there was only whitespace before the colon we won't have captured anything
                $section = $1; # if we actually found a section label, save it
            }
            # $section should now be set to the last section label we found.
            if ($section eq '...') { ... }
        }
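
On the concern that some lines in @m carry several sections: one possible way around it, sketched below and only lightly checked, is to split each line on the labels instead of matching a single label per line. A capturing group in the `split` pattern keeps the labels in the returned list, so each line becomes "leading text, label, value, label, value, ...". The label list and the sub name `parse_sections` are assumptions made for the example, not something taken from the real data.

```perl
use strict;
use warnings;

# Hypothetical label list, inferred from the regexes posted in this thread.
my @labels   = ('Dig No', 'Prior', 'Digstrt', 'Time', 'Address', 'Subdivsn', 'Remarks');
my $label_re = join '|', map { quotemeta } @labels;

# Takes a list of lines and returns a label => value hash.
sub parse_sections {
    my @lines = @_;
    my (%m, $section);

    foreach my $line (@lines) {
        # The captured label is returned between the pieces it separates.
        my @parts = split /\b($label_re)\s*:\s*/, $line;

        # Whatever precedes the first label belongs to the section still open.
        my $lead = shift @parts;
        if (defined $section && defined $lead) {
            $lead =~ s/^\s*:?\s*//;    # drop the ":" continuation marker
            $lead =~ s/\s+$//;
            if (length $lead) {
                $m{$section} = length $m{$section} ? "$m{$section} $lead" : $lead;
            }
        }

        # The rest comes in (label, value) pairs; the last label stays open.
        while (@parts) {
            $section = shift @parts;
            my $value = shift @parts;
            $value = '' unless defined $value;
            $value =~ s/\s+$//;
            $m{$section} = $value;
        }
    }
    return %m;
}
```

With this, `my %m = parse_sections(@m);` keeps the "remember the last section" idea from the replies above, but a line such as one holding both an Address and a Subdivsn label would yield two entries instead of one.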