surfmonkey has asked for the wisdom of the Perl Monks concerning the following question:

Hi, this is probably a dumb question, but I'm looking for a simple way to parse a text file that consists of three headers with multiple lines of data between each opening and closing header, for example:
(+Header1)
...
....
(-Header1)

(+Header2)
......
......
......
(-Header2)

(+Header3)
......
......
.....
(-Header3)
I don't want to divide the dataset into three separate files. Anyone got any ideas?

edited: Fri Jun 28 00:13:37 2002 by jeffa - added pre tags

Re: Text file filtering
by dimmesdale (Friar) on Jun 27, 2002 at 15:10 UTC
    If I understand your question right, something like this may do what you want:

    $your_data =~ /\(\+Header1\)\n((?:.|\n)+?)\(-Header1\)\s+\(\+Header2\)\n((?:.|\n)+?)\(-Header2\)\s+\(\+Header3\)\n((?:.|\n)+?)\(-Header3\)/;

    I wouldn't suggest using that as anything but a starting point, though. The /s modifier makes . match newlines as well, which is tidier than alternating with \n. You may also want to consider a line-by-line parse, with boolean (i.e., flag) variables to tell you which section you're in.

    The last suggestion might be started like this:

    while ($line = <your_file_handle>) {
        if ($line =~ /^\(\+Header1\)/) {
            $header1 = 1;   # entering section 1
            next;           # +Header1 has its own line, so skip it
        }
        elsif ($line =~ /^\(-Header1\)/) {
            $header1 = 0;   # leaving section 1
            next;
        }
        # . . .
        if ($header1) {
            $h1_data[$index++] = $line;
        }
        # . . .
    }
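    The flag idea can be reduced to a single "current section" variable. The following is only a sketch under assumptions: the sample data, the %section hash and the $current variable are made up for illustration, and it assumes every marker sits on its own line in exactly the (+Name) / (-Name) form shown in the question.

```perl
use strict;
use warnings;

# Made-up sample input in the layout from the question.
my $text = <<'END';
(+Header1)
a1
a2
(-Header1)

(+Header2)
b1
(-Header2)

(+Header3)
c1
c2
(-Header3)
END

my %section;   # header name => array ref holding the lines of that block
my $current;   # name of the block we are inside, undef when outside any block

# Read from an in-memory filehandle so the sketch runs without a data file.
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
while (my $line = <$fh>) {
    if ($line =~ /^\(\+(\w+)\)/) {   # opening marker, e.g. (+Header1)
        $current = $1;
        next;
    }
    if ($line =~ /^\(-\w+\)/) {      # closing marker, e.g. (-Header1)
        undef $current;
        next;
    }
    # Keep the line only while we are inside a block.
    push @{ $section{$current} }, $line if defined $current;
}
close $fh;

print "$_: ", scalar @{ $section{$_} }, " line(s)\n" for sort keys %section;
```

    Keying the hash on the header name means the same loop copes with any number of sections, and the whole dataset stays in one file.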

    Again, these are just quick things to get you started thinking; you may want to look at this site and search for regex, or look at the perlman pages for more information regarding them.

      Thanks. What I want to do is match the block of data between each pair of headers and put it into a separate variable, so that at the end I have three variables containing the data from between the respective headers. I just can't get my head around using regexps :o(
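    One possible way to end up with three variables is a single global match over the whole file contents. This is a hedged sketch, not the only way to do it: it assumes the file has already been read into $your_data (the name used in the regex above), that each marker is alone on its line, and that the header names are literally Header1 to Header3. The sample data, %block and $h1 to $h3 are made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample standing in for the slurped file contents.
my $your_data = <<'END';
(+Header1)
a1
a2
(-Header1)

(+Header2)
b1
(-Header2)

(+Header3)
c1
c2
(-Header3)
END

my %block;   # header name => text found between its two markers

# /m lets ^ and $ match at line boundaries, /s lets . match newlines,
# /g walks through every block in turn. \1 requires the closing marker
# to carry the same name as the opening one.
while ($your_data =~ /^\(\+(\w+)\)\n(.*?)^\(-\1\)$/msg) {
    $block{$1} = $2;
}

# Pull the three blocks out into separate variables with a hash slice.
my ($h1, $h2, $h3) = @block{qw(Header1 Header2 Header3)};

print "Header2 block:\n$h2";
```

    The non-greedy .*? matters here: a greedy .* would run on to the last closing marker it can find instead of stopping at the first.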
Re: Text file filtering
by bwana147 (Pilgrim) on Jun 27, 2002 at 15:40 UTC

    IIUC, the headers stand on a line by themselves. Then, you might want to have a look at the flip-flop operator (..).

    perldoc perlop

    while ( <> ) {
        if ( /\(\+Header1\)/ .. /\(-Header1\)/ ) {
            # you're in part one
        }
        if ( /\(\+Header2\)/ .. /\(-Header2\)/ ) {
            # you're in part two
        }
        if ( /\(\+Header3\)/ .. /\(-Header3\)/ ) {
            # you're in part three
        }
    }
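    To show how the flip-flop could fill three variables, here is a sketch under assumptions: the sample data and the names $part1 to $part3 and $seq are made up, and the input is read from an in-memory filehandle rather than <> so that it runs on its own. In scalar context the range operator returns a sequence number, starting at 1 on the line that opens the range and with "E0" appended on the line that closes it, which gives a way to leave the header lines out.

```perl
use strict;
use warnings;

# Made-up sample input in the layout from the question.
my $text = <<'END';
(+Header1)
a1
a2
(-Header1)

(+Header2)
b1
(-Header2)

(+Header3)
c1
c2
(-Header3)
END

my ($part1, $part2, $part3) = ('', '', '');

open my $fh, '<', \$text or die "cannot open in-memory file: $!";
while (my $line = <$fh>) {
    my $seq;
    # Each .. keeps its own on/off state between loop iterations.
    # $seq is 1 on the opening marker and ends in "E0" on the closing one.
    if ($seq = ($line =~ /^\(\+Header1\)/ .. $line =~ /^\(-Header1\)/)) {
        $part1 .= $line unless $seq == 1 or $seq =~ /E0$/;
    }
    if ($seq = ($line =~ /^\(\+Header2\)/ .. $line =~ /^\(-Header2\)/)) {
        $part2 .= $line unless $seq == 1 or $seq =~ /E0$/;
    }
    if ($seq = ($line =~ /^\(\+Header3\)/ .. $line =~ /^\(-Header3\)/)) {
        $part3 .= $line unless $seq == 1 or $seq =~ /E0$/;
    }
}
close $fh;

print "part one:\n$part1";
```

    The three tests are written out separately on purpose: each .. in the source has its own state, so folding them into one loop over header names would make the sections share a single flip-flop.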

    HTH

    --bwana147