Help With Parsing a File

PrimeLord has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks! I know there has to be an easy solution to my problem, but I just can't seem to find it. I have a data file I am trying to parse. Here is an example of what the data may look like.

1-23-abc45 (11:01)    ABC SET foo foo foo.
Foo data foo.
--
1-23-cba45 (12:02)     ABC RUN foo foo foo.
Foo data foo.
--
2-34-xyz21 (12:03)     ABC SET foo foo foo.
Foo data foo.
!

It's something similar to that. Every time I see ABC SET, I want to check whether the line two lines below it is --. If it is, then I just need to ++ a variable. If not, I want to just move on to the next line.

I am lost on how to look two lines below once I have matched a line. Normally I would have posted the code I have tried, but I just can't come up with anything, and I am sure it is simple to do. I suspect Parse::RecDescent might be useful here, but I have taken a look at it and I just can't wrap my brain around how to use it. Any suggestions you might have would be much appreciated. Here is basically the code I am thinking of, without the matching part.

use strict;

my $value;
open IN, "data.file" or die "$!";
while (<IN>) {
       chomp;
       if (/match the lines/) {
              $value++;
       }
}
close IN or warn "$!";

Again thanks for any suggestions.

- Prime


Replies are listed 'Best First'.
Re: Help With Parsing a File by gjb (Vicar) on Jan 17, 2003 at 22:52 UTC
Rather than looking two lines ahead, you can look two lines back when you reach a '--'. Just a change of perspective that makes things much easier. Just my 2 cents, -gjb-
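The look-back idea could be sketched roughly like this. It is only a minimal illustration, assuming the separator line is exactly `--` and that remembering the previous two lines is enough. The helper name `count_set_records` and the in-memory sample data are made up for the example and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: counts "ABC SET" lines that have a "--" line exactly
# two lines below them, by remembering the two most recent lines.
sub count_set_records {
    my ($fh) = @_;
    my $count = 0;
    my @prev;    # at most the two previous lines, oldest first
    while (my $line = <$fh>) {
        chomp $line;
        # On a separator, look back two lines instead of looking ahead.
        if ($line eq '--' and @prev == 2 and $prev[0] =~ /ABC SET/) {
            $count++;
        }
        push @prev, $line;
        shift @prev if @prev > 2;
    }
    return $count;
}

# Usage with an in-memory filehandle holding the sample data from the question.
my $sample = <<'END';
1-23-abc45 (11:01)    ABC SET foo foo foo.
Foo data foo.
--
1-23-cba45 (12:02)     ABC RUN foo foo foo.
Foo data foo.
--
2-34-xyz21 (12:03)     ABC SET foo foo foo.
Foo data foo.
!
END
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "count: ", count_set_records($fh), "\n";    # prints "count: 1"
close $fh;
```

Only the first record is counted: the second is ABC RUN, and the third ends in ! rather than --. Because only two lines are kept at a time, this should also cope with large files.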
(jeffa) Re: Help With Parsing a File by jeffa (Bishop) on Jan 17, 2003 at 23:40 UTC
If your data is relatively small, you could slurp everything into an array and rely on indices. Assuming that the -- will always appear 2 lines down, you could do something like:

my @data = <IN>;
my $value = 0;
# Stop two lines before the end so that $data[$_ + 2] always exists.
for (0 .. $#data - 2) {
    if ($data[$_] =~ /ABC SET/ and $data[$_ + 2] =~ /^--/) {
        $value++;
    }
}

Your suspicions about Parse::RecDescent are correct: it would be useful here, but it also takes a long time to get right. If a quick fix will get the job done, then take it and run. I would only investigate a P::RD solution if I needed to do something more complex with the data.

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
Re: Help With Parsing a File by tall_man (Parson) on Jan 18, 2003 at 00:04 UTC
You could maintain a small shift buffer with the last two lines read, which would work even when the data file is large.

use strict;

my $value = 0;
my @linebuf;
open IN, "data.file" or die "$!";

# Prime the buffer with the first two lines.
push @linebuf, scalar(<IN>) for 0 .. 1;

while (<IN>) {
    # $linebuf[0] is always the line two above the current one.
    if (/^--/ and $linebuf[0] =~ /ABC SET/) {
        print "Matched ", $linebuf[0];
        $value++;
    }
    shift @linebuf;
    push @linebuf, $_;
}
close IN or warn "$!";
print "total found is $value\n";
Re: Help With Parsing a File by mirod (Canon) on Jan 18, 2003 at 00:09 UTC
Your "records" clearly include 3 lines, so just read the file 3 lines at a time:

#!/usr/bin/perl -lw
use strict;

my $count = 0;

# Read one three-line record per iteration. Each <DATA> is called in scalar
# context; in list context the first one would slurp the whole file.
while (defined(my $first = <DATA>)) {
    my $second = <DATA>;
    my $third  = <DATA>;
    last unless defined $third;    # incomplete record at end of file
    $count++ if $third =~ m{^--$} and index($first, 'ABC SET') != -1;
}

print "count: $count";

__DATA__
1-23-abc45 (11:01) ABC SET foo foo foo.
Foo data foo.
--
1-23-cba45 (12:02) ABC RUN foo foo foo.
Foo data foo.
--
2-34-xyz21 (12:03) ABC SET foo foo foo.
Foo data foo.
!
Re: Help With Parsing a File by runrig (Abbot) on Jan 18, 2003 at 00:15 UTC
Yet another way utilizing $/:

$/ = "\n--\n";
my $value;
while (<>) {
    my @arr = split "\n";
    # The last chunk of the file may not end in "--", so check for it explicitly.
    if (@arr >= 3 and $arr[-1] eq '--' and $arr[-3] =~ /ABC SET/) {
        $value++;
    }
}
Re: Re: Help With Parsing a File by OM_Zen (Scribe) on Jan 18, 2003 at 04:07 UTC
Hi,

while(<fname>){
    $/="\n--\n";
    ($_ =~ /ABC SET/)?$values++:next;
}
Re: Re: Re: Help With Parsing a File by runrig (Abbot) on Jan 18, 2003 at 17:16 UTC
Not exactly what the OP was asking for. He wants "ABC SET" on one line, and a line with nothing but "--" two lines below it. Your solution will find "ABC SET" and "--", but we don't know how many lines apart they are. This might work though:

$/="\n--\n";
while (<FILE>) {
    $value++, next if /.*?ABC SET.*\n.*\n--\n/;
}

