in reply to Extract table from a block of text

The flip-flop operator can tell you whether you're between the given lines. No need to hash anything as the output depends on the current line only.
#!/usr/bin/perl
use warnings;
use strict;

while (<DATA>) {
    if (my $line = /^INFO START$/ .. /^END$/) {
        next if /^$/             # Skip empty lines.
             or $line =~ /E/     # Skip the END line.
             or 1 == $line       # Skip the START line.
             or /^STIME/;        # Skip the header.
        my ($stime, $etime, @cols) = split;
        print "$stime:$etime\t$_\n" for @cols;
        print "\n";
    }
}

__DATA__
... ignore ...
INFO START
STIME ETIME COLUMN3 COLUMN4 COLUMN5
aaaa1 bbb1 ccc1 ddd1 eee1
aaaa2 bbb2 ccc2 ddd2 eee2
aaaa3 bbb3 ccc3 ddd3 eee3
aaaa4 bbb4 ccc4 ddd4 eee4
END
... ignore again ...
لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Re^2: Extract table from a block of text (updated)
by LanX (Saint) on Sep 21, 2014 at 11:07 UTC
    Hi Choroba,

    As a side note:

    Instead of parsing the sequence number $line you could apply the technique described in Re^4: grep trouble (body of flip-flop range) to skip the boundaries of the flip and the flop. :)

    Cheers Rolf

    (addicted to the Perl Programming Language and ☆☆☆☆ :)

    UPDATE

    In hindsight, it's a bad idea to use something like:

    #!/usr/bin/perl
    use warnings;
    use strict;

    while (<DATA>) {
        if (/^INFO START$/ .. /^END$/ and not //) {
            print "$_";
        }
    }

    __DATA__
    ... ignore ...
    INFO START
    STIME ETIME COLUMN3 COLUMN4 COLUMN5
    aaaa1 bbb1 ccc1 ddd1 eee1
    aaaa2 bbb2 ccc2 ddd2 eee2
    aaaa3 bbb3 ccc3 ddd3 eee3
    aaaa4 bbb4 ccc4 ddd4 eee4
    END
    ... ignore again ...

    While it does only print the inner range ...

    STIME ETIME COLUMN3 COLUMN4 COLUMN5
    aaaa1 bbb1 ccc1 ddd1 eee1
    aaaa2 bbb2 ccc2 ddd2 eee2
    aaaa3 bbb3 ccc3 ddd3 eee3
    aaaa4 bbb4 ccc4 ddd4 eee4

    ... the empty pattern // (i.e. match again with the last successfully matched regular expression) is easily broken by any other regex that runs within the if-branch. :-/

    The usual trap of global dependencies!
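One way to keep the idea of "and not //" without relying on the last successful match is to name the boundary patterns and re-test them explicitly. This is only a sketch: $start, $end and inner_lines are hypothetical names, and the patterns are the ones from the sample data.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Boundary patterns, stored once so they can be reused by name.
my $start = qr/^INFO START$/;
my $end   = qr/^END$/;

# inner_lines (hypothetical helper): return the lines strictly between
# the START and END markers.
sub inner_lines {
    my @inner;
    for (@_) {
        # Inside the range, but not on either boundary line. The boundary
        # test names its patterns instead of using the empty pattern //.
        if ((/$start/ .. /$end/) and not (/$start/ or /$end/)) {
            push @inner, $_;
        }
    }
    return @inner;
}

my @sample = ('... ignore ...', 'INFO START', 'aaaa1 bbb1', 'aaaa2 bbb2',
              'END', '... ignore again ...');
print "$_\n" for inner_lines(@sample);
```

Each boundary pattern is matched twice per line inside the range, which is the price for not depending on global regex state.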

Re^2: Extract table from a block of text
by Laurent_R (Canon) on Sep 21, 2014 at 10:48 UTC
    This works perfectly with the dummy data provided in the original post, but the regex to skip the END line might be a bit dangerous because real data might contain an 'E'. In addition, if the file is large, it might be better to use last, rather than next, when the line with the END tag is reached.
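The suggestion to leave the loop at the END line could look roughly like this. It is a sketch under assumptions: table_rows is a hypothetical helper, the input has a single table, and the in-memory filehandle only stands in for a real file.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# table_rows (hypothetical helper): collect the data rows of the first
# table and stop reading as soon as its END line is seen.
sub table_rows {
    my ($fh) = @_;
    my @rows;
    while (<$fh>) {
        chomp;
        my $seq = /^INFO START$/ .. /^END$/;
        next unless $seq;           # Not inside the table yet.
        last if $seq =~ /E0$/;      # END line: stop, do not read further.
        next if $seq == 1           # Skip the START line.
             or /^$/                # Skip empty lines.
             or /^STIME/;           # Skip the header.
        push @rows, $_;
    }
    return @rows;
}

my $text = join "\n", '... ignore ...', 'INFO START', 'STIME ETIME COLUMN3',
                      'aaaa1 bbb1 ccc1', 'END', '... ignore again ...', '';
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print "$_\n" for table_rows($fh);
```

After the call, the filehandle is positioned just past the END line, so the rest of a large file is never read.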
      > but the regex to skip the END line might be a bit dangerous because real data might contain an E

      That's a misunderstanding: $line holds the flip-flop's sequence number, which carries an E0 suffix (like 7E0) iff the flip-flop terminates on that line.

      Has nothing to do with the END marker! :)
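To see the sequence numbers for yourself, a small sketch like the following collects the value the flip-flop returns for each line (flipflop_values is a hypothetical helper name, and the sample lines are made up):

```perl
#!/usr/bin/perl
use warnings;
use strict;

# flipflop_values (hypothetical helper): return what the flip-flop
# operator yields for each input line.
sub flipflop_values {
    my @values;
    for (@_) {
        my $seq = /^INFO START$/ .. /^END$/;
        push @values, $seq;
    }
    return @values;
}

my @lines  = ('junk', 'INFO START', 'row1', 'row2', 'END', 'junk');
my @values = flipflop_values(@lines);

# Values for the lines above: '', 1, 2, 3, '4E0', ''
printf "%-12s => '%s'\n", $lines[$_], $values[$_] for 0 .. $#lines;
```

Only the last line of the range gets the E0 suffix, so matching /E/ against the sequence number never looks at the data itself.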

      Cheers Rolf

      (addicted to the Perl Programming Language and ☆☆☆☆ :)

        Yes, you are absolutely right, I looked at it too quickly and confused $_ and $line. Sorry for that silly comment.