Re: Skipping data on file read

State machines are good, but I don't think you need one here, because:

All comment lines start with 'c'
You don't need to look at a previous line to see if you're in the 'm' state. Unless the line starts with 'c', you are.

Try this after you've opened the input file. (Nice job on using the 3-arg form of open, and on using warnings and strict, btw.)

while (<$FILE>) {    # Don't need to use a $line variable; leave it in $_
    next if /^c\s/;   # Skip comments

    $_ = (split /\$/)[0];  # Throw away everything to the right of $ (if any)

    # I'm not sure from your description if this is the pattern you want; I'm guessing
    foreach my $datum (/ (\w+\.\w\wc) /gx) {
        push(@data, $datum);       # This could be made terser
    }
}

I'm assuming that all comment lines start with a 'c' and aren't continued.
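For anyone who wants to try the idea without the real file, here is a minimal sketch that runs the same loop over an in-memory string. The sample lines and the helper name extract_ids are made up for illustration, and the real file layout may differ. It also swaps the split for a substitution (s/\$.*//s), which avoids an undefined $_ on a final line that consists of a bare '$'.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real file layout is an assumption here.
my $sample = <<'END';
c  fuel material
m1   92235.30c 0.05  $ enriched part
     92238.30c 0.95
c  another comment
m2   94241.90c 1.0
END

# extract_ids is an illustrative helper, not part of the original post.
# It returns every token shaped like digits.NNc, skipping comment lines
# and anything to the right of a '$'.
sub extract_ids {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Can't open in-memory file: $!";
    my @data;
    while (<$fh>) {
        next if /^c\s/;                 # skip comment lines
        s/\$.*//s;                      # drop a trailing $ comment, if any
        push @data, /(\w+\.\w\wc)/g;    # list-context match returns all captures
    }
    close $fh;
    return @data;
}

my @ids = extract_ids($sample);
print "$_\n" for @ids;
```

The push with a list-context match is the terser form hinted at in the comment above; whether the pattern is the right one still depends on the actual data.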

BTW, having both an array and a scalar named 'array' is confusing and unnecessary. In your code

        $array=@array;
        for ( $i=1; $i<$array; $i=$i+2)

Better to just say

    for ( $i=1; $i < @array; $i=$i+2)

The '<' puts @array in a scalar context, so $i is compared to the length of @array.
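A small sketch of that context rule, using a made-up array (the values are only placeholders): assigning an array to a scalar, or comparing it numerically, yields its element count, so no separate length variable is needed.

```perl
use strict;
use warnings;

# Placeholder data, not taken from the real input file.
my @array = qw(m1 92235.30c 1.0 92238.30c 2.0);

my $count = @array;     # scalar assignment: number of elements
my $last  = $#array;    # index of the last element

# Collect the odd-indexed elements; '<' evaluates @array in scalar context.
my @picked;
for (my $i = 1; $i < @array; $i += 2) {
    push @picked, $array[$i];
}

# prints: count=5 last=4 picked=92235.30c 92238.30c
print "count=$count last=$last picked=@picked\n";
```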

Tell us how it goes
throop

Code not tested


Replies are listed 'Best First'.
Re^2: Skipping data on file read
by igotlongestname (Acolyte) on Jun 16, 2008 at 15:29 UTC

Throop, thank you so much for your suggestions and tips (I have a lot to learn in the Perl thing, but it's sure fun). I actually didn't look back here after the first two posts about state machines (didn't have the foggiest idea of what to do) so I went after the Tie::File routine and got it to do what I wanted it to do, and will attach the code here.

I think in my original code I am in the 'm' state, as you coined it (I like that name), until a comment occurs. But I also screw up if there are two 'm' states in a row with no comment in between (for instance an 'm1 94235 1.0' line followed by 'm2 94235 1.0'), since comments aren't obligatory. What happened with two 'm' states in a row is this: the first match occurs (say m1 in the preceding example) and the data is parsed correctly, but then a $line = <$FILE> call INSIDE the loop advances one line (now landing on the m2 line). The next iteration of the loop then makes the same "$line = <$FILE>" call, which advances to the line after 'm2', so the m2 data is skipped.
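The skipped-line effect can be reproduced in a few lines. This is only a reconstruction from the description, not the original script: the sub names and the three-line sample are invented for the demonstration.

```perl
use strict;
use warnings;

# Two 'm' lines in a row with no comment between them (sample from the post).
my $text = "m1 94235 1.0\nm2 94235 1.0\nc done\n";

# Buggy shape: a second read inside the loop body consumes a line
# that the loop never gets to examine.
sub cards_with_inner_read {
    my ($input) = @_;
    open my $fh, '<', \$input or die "Can't open in-memory file: $!";
    my @seen;
    while (my $line = <$fh>) {
        next unless $line =~ /^(m\d+)/;
        push @seen, $1;
        $line = <$fh>;    # swallows the following line (here: the m2 line)
    }
    close $fh;
    return @seen;
}

# Fixed shape: only the loop condition reads from the handle.
sub cards_single_read {
    my ($input) = @_;
    open my $fh, '<', \$input or die "Can't open in-memory file: $!";
    my @seen;
    while (my $line = <$fh>) {
        push @seen, $1 if $line =~ /^(m\d+)/;
    }
    close $fh;
    return @seen;
}

print "inner read:  @{[ cards_with_inner_read($text) ]}\n";
print "single read: @{[ cards_single_read($text) ]}\n";
```

With the sample above, the first sub reports only m1 while the second reports m1 and m2, which matches the symptom described.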

What I didn't include or show is that the file is some thousands of lines long, and the only information I wanted was the data described. The data also isn't always in the same format or the same values (e.g. in one file m1 may be 92235.30c, but in another file it may be m1 94241.90c ... all depending upon user input), but it does have prescribed formatting.

Anyways, thanks for the help and encouragement; I can feel pretty dumb in these forums :-)

If you get a chance, take a look at the code, which works the way I want. I feel that I should be able to make it more concise using the "$." variable, but couldn't see how. Also, I don't see how the "%hash", mapping each element to a 1, condenses the array down into only the unique keys ... is this a built-in feature of the map function in relation to hashes? Thanks!

#!/usr/local/bin/perl
use strict;
use warnings;
use Tie::File;

print "Enter the filename to analyze (we can hardwire this later): ";
chomp ( my $filename = <STDIN> );

open my $FILE, '<', $filename or die "Can't read the source: $!";
open my $CHECK, '>', "Space_Nukes_Rule_$filename" or die "Can't open output file: $!";
open my $OUT,   '>', "Out_Space_Nukes_Rule_$filename" or die "Can't open output file: $!";

my ($i, $j, $popindex, $array, $arraytemp);
my (@array, @subarray, @arraytemp, @data, @line, @INFILE);

tie @INFILE, 'Tie::File', $FILE or die "dieeeee";

for ( $i=0; $i<@INFILE; $i++ ) {
    if ( $INFILE[$i] =~ /^mt?\d+/ ) {
        @arraytemp = ( split qr/\$/s, $INFILE[$i] );       
        @array = ( split qr/\s+/s, $arraytemp[0] );
        $array=@array;                      
        for ( $j=1; $j<$array; $j=$j+2) {    
            push @data, "$array[$j]\n";        
        }
        $i++;        
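        # Stop at end of file, at a comment, or at the next 'm' line;
        # without the end-of-file test this loops forever if the file ends inside an 'm' block.
        until ( $i >= @INFILE or $INFILE[$i] =~ /^c/ or $INFILE[$i] =~ /^mt?\d+/ ) {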
            @arraytemp = ( split qr/\$/s, $INFILE[$i] );  
            @array = ( split qr/\s+/s, $arraytemp[0] );
            $array=@array;
            for ( $j=1; $j<$array; $j=$j+2) {
                push @data, "$array[$j]\n";
            }
            $i++;
        }
        $i--;
    }
}
print $CHECK "@data\n";
my %hash = map { $_, 1 } @data;
my @unique_data = keys %hash;
print $OUT "@unique_data";
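On the %hash question: the deduplication is a property of hashes, not of map. A hash can hold each key only once, so assigning a list with repeated keys to a hash keeps one entry per key. map merely builds the flat key/value list. A short sketch with made-up data (the grep variant at the end is an additional, commonly used idiom that keeps first-seen order, which keys does not guarantee):

```perl
use strict;
use warnings;

# Placeholder data with one duplicate.
my @data = qw(92235.30c 94241.90c 92235.30c);

# map only builds a flat list: (value, 1, value, 1, ...)
my @pairs = map { $_, 1 } @data;

# Assigning that list to a hash is what collapses the duplicates,
# because a later pair with the same key overwrites the earlier one.
my %hash   = @pairs;
my @unique = sort keys %hash;    # sorted, since keys returns no fixed order

# Order-preserving alternative: keep an element only the first time it is seen.
my %seen;
my @in_order = grep { !$seen{$_}++ } @data;

print "pairs:    @pairs\n";
print "unique:   @unique\n";
print "in order: @in_order\n";
```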