Hope you understood Corion's reply. "while (<DATA>)" causes $_ to be set to the next line in __DATA__ at every iteration.

This construct: my $num = /regex/.../regex/ uses the range operator in scalar context, also known as the flip-flop operator. A classic post on this by Grandfather is: Flipin good, or a total flop?.

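A single regex that matches returns a true value; in scalar context that value is 1 (and the empty string when the match fails). When 2 regexes are joined by the ... operator, the result is a sequence number that says which line of the record we are on, counting from 1 at the line that matched the first regex. Outside a record the result is false (the empty string).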
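To check the return value of a single match for yourself, here is a minimal sketch. The helper name match_value and the two sample strings are mine, chosen only for illustration; they are not from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: return the scalar-context value of one regex match.
sub match_value {
    my ($line) = @_;
    my $result = $line =~ /^\[/;   # scalar assignment forces scalar context
    return $result;
}

# Quotes make the empty string visible in the output.
printf "match:    '%s'\n", match_value("[");
printf "no match: '%s'\n", match_value("ID: 123");
```

On a successful match this should show '1', and on a failed match an empty pair of quotes.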

I would suggest that you put a print "num=$num\n"; statement in the loop and watch what happens. You will see values like: 1,2,3,4E0.

The 4E0 means that something is different about this line number! And indeed there is. It is the line that contains the ']' character (the last line of the record, the line that matches the 2nd regex). Perl appends the string "E0" to the last sequence number of a range. The E0 is just exponential notation meaning 10**0, and any number raised to the zeroth power is 1, so 4E0 = 4 * 10**0 = 4 * 1 = 4 from a numeric perspective. So this is a clever way to return 2 pieces of information in a single value: the E0 suffix means the record is over, and if I wanted to do some math on the value, it is still a perfectly legitimate representation of the number 4.
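Here is a small sketch of what the flip-flop returns line by line. The sample lines, the start pattern /^\[/ and the helper name sequence_values are my assumptions for illustration, not the code from earlier in the thread. Because this start pattern matches the '[' line itself, the count here ends at 5E0; where the count ends depends on which line the first regex matches.

```perl
use strict;
use warnings;

# Illustrative record plus one line that is outside any record.
my @lines = ('[', 'ID: 123', 'Start: /tmp/file.1',
             'Done: /complete/success.1', ']', 'junk');

# Hypothetical helper: collect the flip-flop value for each line given.
sub sequence_values {
    my @values;
    for (@_) {                       # $_ is aliased to each line in turn
        my $num = /^\[/ ... /\]/;    # range operator in scalar context
        push @values, $num;
    }
    return @values;
}

my @values = sequence_values(@lines);
print join(',', map { "'$_'" } @values), "\n";
# prints '1','2','3','4','5E0',''

for my $num (@values) {
    next unless $num;                               # empty string: not in a record
    my $where = $num =~ /E0$/ ? "last line" : "inside record";
    print "num=$num numeric=", $num + 0, " ($where)\n";
}
```

Testing for the E0 suffix with a regex detects the end of the record, while adding 0 shows that the same value still behaves as an ordinary number.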

Update:
I could have written the code with a more conventional parsing scheme: when the first line of a record is detected, call a subroutine which processes lines until the last line of the record is detected. This eliminates the need for flag values like "I'm inside the record now". The flip-flop implementation essentially does what the code below does:

#!/usr/bin/perl -w
use strict;

while (<DATA>)
{
    process_record() if /^\[/;   #start of record
}

sub process_record
{
    my %record;
    my $line;
    my $line_num=1;
    while (defined ($line = <DATA>) and $line !~ /\]/)
    {
        print "line= ",$line_num++," ",$line;
        # do splits and fill in %record here
    }
    print "Record Complete!\n\n";
    # use %record here to populate other hashes
    # %record is thrown away when sub returns.
}

=prints
line= 1 ID: 123
line= 2 Start: /tmp/file.1 /tmp/file.2 /tmp/file.3
line= 3 Done: /complete/success.1 /complete/success.2
Record Complete!

line= 1 ID: 456
line= 2 Start: /complete/success.1 /complete/success.2 /tmp/file.3
line= 3 Done: /complete/success.3 /complete/success.4
Record Complete!

=cut

__DATA__
[
ID: 123
Start: /tmp/file.1 /tmp/file.2 /tmp/file.3
Done: /complete/success.1 /complete/success.2
]
[
ID: 456
Start: /complete/success.1 /complete/success.2 /tmp/file.3
Done: /complete/success.3 /complete/success.4
]

In reply to Re^3: Parsing a file and finding the dependencies in it by Marshall
in thread Parsing a file and finding the dependencies in it by legendx
