tariqahsan has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to read a file which looks something like this:

    RULE=test
    BEGIN
    NUMOFRULES=2
    STMT1=BEGIN
    A
    B
    C
    STMT1=END
    STMT2=BEGIN
    X
    Y
    Z
    STMT2=END

I can have any number of these STMT blocks.

My code snippet is like -

    open (RULE, "$rulefile") || die "Can't open $rulefile for reading $!\n";
    while ( defined ($_ = <RULE>) ) {
        if (/RULE=$rule/../^END$/) {
            chomp $_;
            if ($_ =~ /^NUMOFRULES/) {
                # Get the number of rules
                @rulenos = split(/=/);
            }
            for ($i = 1; $i <= $rulenos[1]; $i++) {
                if (/^STMT$i=BEGIN$/../^STMT$i=END$/) {
                    if ( !($_ =~/^STMT/)) {
                        print $_;
                    }
                }
            }
        }
    }

-Getting-

    A
    A
    B
    B
    C
    C
    ...

-Want to get-

    A
    B
    C
    ...

Any idea how I can achieve this?

Replies are listed 'Best First'.
Re: Problem in using range operator
by tlm (Prior) on May 03, 2005 at 22:18 UTC

    Note that in this case only your inner loop is doing any noticeable work (the printing), so you can just get rid of the outer loop.

        while ( <RULE> ) {
            if (/^STMT/.../^STMT/) {
                print unless /^STMT/;
            }
        }

    ...or if you are into the whole brevity thang:

        /^STMT/.../^STMT/ and !/^STMT/ and print while <RULE>;
        __END__
        A
        B
        C
        X
        Y
        Z

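    I used the ... operator instead of .. so that I could get away with the same pattern for both the left and the right sides of the operator (with .., the right side is tested on the same line that made the left side true, so the range would close as soon as it opened), but I could have also used:
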
    if (/^STMT\d+=BEGIN/../^STMT\d+=END/)
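
The difference between .. and ... when both sides use the same pattern can be seen in a minimal sketch like the one below. The sub names with_two_dots and with_three_dots and the @lines array (standing in for the lines read from RULE) are made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Sample lines modelled on the question's data file (already chomped).
my @lines = qw(STMT1=BEGIN A B C STMT1=END STMT2=BEGIN X Y Z STMT2=END);

# Two dots: the right operand is tested on the same line that made the
# left operand true, so the range opens and closes on every STMT line.
sub with_two_dots {
    my @kept;
    for (@_) {
        if (/^STMT/ .. /^STMT/) {
            push @kept, $_ unless /^STMT/;
        }
    }
    return @kept;
}

# Three dots: the right operand is first tested on the line after the
# one that opened the range, so the range stays open until the next
# STMT line.
sub with_three_dots {
    my @kept;
    for (@_) {
        if (/^STMT/ ... /^STMT/) {
            push @kept, $_ unless /^STMT/;
        }
    }
    return @kept;
}

print "two dots:   @{[ with_two_dots(@lines) ]}\n";
print "three dots: @{[ with_three_dots(@lines) ]}\n";
```

With two dots nothing between the markers is kept; with three dots the lines between each pair of STMT markers are.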

    the lowliest monk

Re: Problem in using range operator
by Transient (Hermit) on May 03, 2005 at 22:11 UTC
    Your inner for loop is processing the same line twice (or more precisely: it processes the same line the same number of times as the total number of rules you have, in this case, two).

    Sounds like what you need is a flag to indicate that you are currently between a statement's BEGIN and END, or a variable to keep track of which statement you are currently in.

    Is it necessary for you to use the range operator to grab the statements (e.g. A, B, C)?
      Actually, here I am using the range operator more as a flip-flop boolean state operator on its left and right operands. I am using STMT?=BEGIN and STMT?=END as begin and end markers in the data file. My problem is that I can have any number of these blocks. The outermost block is delimited by RULE and END, and there can be several of those too. I am using NUMOFRULES to hold the number of STMT? blocks in the data file. I know that the for loop is printing the same line as many times as the NUMOFRULES value. I tried using flags, but couldn't get it right.
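
For reference, the flag idea discussed above might look roughly like the sketch below. It is only one possible reading of the requirements: the sub name stmt_lines, the second RULE=other block, and the closing END line in the sample data are assumptions added for illustration (the posted data shows neither a second rule nor the END line), and the lines are assumed to be chomped already.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the body lines of every STMTn block found
# inside the RULE=$rule ... END block, using two flags instead of "..".
sub stmt_lines {
    my ($rule, @lines) = @_;
    my $in_rule = 0;    # true between RULE=$rule and END
    my $current;        # number of the open STMT block, undef if none
    my @body;
    for my $line (@lines) {
        if (!$in_rule) {
            $in_rule = 1 if $line =~ /^RULE=\Q$rule\E$/;
            next;
        }
        if (defined $current) {
            if ($line eq "STMT$current=END") { undef $current }
            else                             { push @body, $line }
        }
        elsif ($line =~ /^STMT(\d+)=BEGIN$/) { $current = $1 }
        elsif ($line eq 'END')               { $in_rule = 0 }
    }
    return @body;
}

# Assumed sample input: two rules, each closed by an END line.
my @file = qw(
    RULE=other BEGIN NUMOFRULES=1 STMT1=BEGIN Q STMT1=END END
    RULE=test BEGIN NUMOFRULES=2
    STMT1=BEGIN A B C STMT1=END
    STMT2=BEGIN X Y Z STMT2=END
    END
);
print "$_\n" for stmt_lines('test', @file);
```

Because the open block's number is captured from the BEGIN marker, this version does not need NUMOFRULES or a loop over block numbers, so each line is looked at once.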
Re: Problem in using range operator
by johnnywang (Priest) on May 03, 2005 at 22:50 UTC
    I think the problem is that when the ".." evaluates its left-hand side and it becomes true, the operator remains true until the right-hand side becomes true. So once you are in range for, say, line "A" because of STMT1=BEGIN, you will still be in range when you test /STMT2=BEGIN/../STMT2=END/, since it hasn't seen /STMT2=END/ yet. The following works OK, but I'm not sure it's exactly what you want (BTW, "use strict" might help).

        while ( defined ($_ = <DATA>) ) {
            if (/RULE=$rule/../^END$/) {
                chomp $_;
                if ($_ =~ /^NUMOFRULES/) {
                    # Get the number of rules
                    @rulenos = split(/=/);
                }
                my $e = $rulenos[1];
                if (/^STMT1=BEGIN$/../^STMT$e=END$/) {
                    if ( !($_ =~/^STMT/)) {
                        print $_;
                    }
                }
                # for ($i = 1; $i <= $rulenos[1]; $i++) {
                #     if (/^STMT$i=BEGIN$/../^STMT$i=END$/) {
                #         if ( !($_ =~/^STMT/)) {
                #             print $_;
                #         }
                #     }
                # }
            }
        }
        __DATA__
        RULE=test
        BEGIN
        NUMOFRULES=2
        STMT1=BEGIN
        A
        B
        C
        STMT1=END
        STMT2=BEGIN
        X
        Y
        Z
        STMT2=END

    Update: a subtle point which I didn't know until thinking about this problem is that each .. keeps its own state of being in range or not, and "each" here does NOT depend on what the left and right operands are. A .. written once inside a loop therefore counts as one range operator with one state, no matter how many iterations run it with different patterns, as shown by the following code:

        use strict;
        my @data = <DATA>;
        foreach (@data) {
            chomp;
            if (/START1/../END1/) {
                print "1: $_\n";
            }
            if (/START2/../END2/) {
                print "2:$_\n";
            }
        }
        print "\n\n";
        foreach (@data) {
            foreach my $i (1..2) {
                if (/START$i/../END$i/) {
                    print "$i:$_\n";
                }
            }
        }
        __DATA__
        START1
        A
        B
        C
        END1
        START2
        X
        Y
        Z
        END2

    output is as follows, notice the difference:

        __OUTPUT__
        1: START1
        1: A
        1: B
        1: C
        1: END1
        2:START2
        2:X
        2:Y
        2:Z
        2:END2


        1:START1
        2:START1
        1:A
        2:A
        1:B
        2:B
        1:C
        2:C
        1:END1
        2:START2
        1:X
        2:X
        1:Y
        2:Y
        1:Z
        2:Z
        1:END2
        2:END2
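
If a separate in-range state per index is what is wanted inside the nested loop, one option is to keep that state explicitly rather than relying on the operator. The sketch below emulates the two-dot behaviour with a hash keyed by index; the sub name tag_lines and the %in_range hash are hypothetical names chosen for this example, and the data is the same START/END sample as above.

```perl
use strict;
use warnings;

my @data = qw(START1 A B C END1 START2 X Y Z END2);

# Hypothetical helper: emulates one ".." flip-flop per index by keeping
# each index's in-range state in a hash instead of inside the operator.
sub tag_lines {
    my ($max, @lines) = @_;
    my %in_range;    # index => true while that index's range is open
    my @tagged;
    for my $line (@lines) {
        for my $i (1 .. $max) {
            # Open the range for this index when its start marker is seen.
            $in_range{$i} = 1 if $line =~ /START$i/;
            next unless $in_range{$i};
            push @tagged, "$i:$line";
            # As with "..", the end marker is checked on the same line.
            $in_range{$i} = 0 if $line =~ /END$i/;
        }
    }
    return @tagged;
}

print "$_\n" for tag_lines(2, @data);
```

Each line is now tagged only with the index whose block it belongs to, which matches the first half of the output above rather than the second.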