in reply to Re^2: File Parsing
in thread File Parsing

This is much better. I took a closer look at your original code, and came up with a simplified version. If I understand the problem, you want to make sure you can read through the "AtrFile" once without any problems, to make sure you can get to the end of it correctly, and if that works, then you want to read it again and "decode" some of the content into another file. If so, then the biggest issue is to improve your error checking.

Also, I'm no jedi when it comes to pack/unpack, but I was struck by your use of  "a" . $Lngth to unpack "$num_block" into "$LblVal". I think this is an unnecessary use of unpack, because the packed and unpacked values turn out to be identical. Consider:

$packed = join "", map { chr() } (0x01..0x20); $plen = length( $packed ); $tmpl = "a" . $plen; print "unpacking $plen bytes using (a): "; $unpacked = unpack( $tmpl, $packed ); $result = ( $packed eq $unpacked ) ? "same" : "different"; print "$result\n";'

For me that outputs "unpacking 32 bytes using (a): same". (Update: you can change the first line to "0x00..0xff", and the result on 256 bytes will still be "same".)

Anyway, as for the simplified version of your code, you'll see that I prefer to write a subroutine rather than write the same block of code more than once. Also, I use "seek" to get back to the start of the file for the second pass, rather than closing and reopening. Finally, I add a lot of error checking, and "die" with suitable messages when things don't go as expected. (There are other ways, but this is a decent start.) I threw in enough bogus variable declarations so as to retain all your original variables and still pass "strict" (but apart from that, it's untested, of course):

my $iIcao = "???"; my $AtrFile = "some_file_name"; open( my $atrfh, $AtrFile ) or die "$AtrFile: $!"; binmode $atrfh; my $base_readsize = 8; my $base_block; if (( read $atrfh, $base_block, $base_readsize ) != $base_readsize ) { die "$AtrFile: read failed on first $base_readsize bytes: $!"; } my ($Cksum, $NxIdx) = unpack( 'II', $base_block ); my $offset = $base_readsize; my ( $lastrec, $bytecount ) = parse_records( $atrfh, $offset, "debug" +); # parse_records will die if there are problems with the file data print "$AtrFile: $lastrec records, $bytecount bytes read okay\n"; # Now that AtrFile has passed the sanity checks, start print to Type09 +File; # Just rewind $atrfh to the first LBL record and repeat the read loop print Type09File "$iIcao Set_Header: $Cksum, $NxIdx\n"; seek $atrfh, $base_readsize, 0; parse_records( $atrfh, $offset, "print" ); close( $atrfh ); sub parse_records { my ( $rfh, $offset, $mode ) = @_; my $lbl_readsize = 20; my ( $lbl_block, $num_block ); my $bytes_read; my $recid = 0; my $lbltmpl = 'i i b8 b16 b8 i C'; while (( $bytes_read = read $rfh, $lbl_block, $lbl_readsize ) == $ +lbl_readsize ) { $recid++; $offset += $lbl_readsize; my ( $NumLbl, $LblKnd, $ZmLvl, $FntSz, $res, $Lngth, $Ornt ) = unpack( $lbltmpl, $lbl_block ); if (( read $rfh, $num_block, $Lngth ) != $Lngth ) { die "$AtrFile: can't read $Lngth bytes at rec# $recid (off +s: $offset): $!"; } # the following assumes that there is an open file handle # called "Type09File" -- might be better to make this a # "my" variable and pass it as an arg... if ( $mode eq 'print' ) { print Type09File "Label_Header: " . join( ", ", $NumLbl, $LblKnd, $ZmLvl, $FntSz, $res, $L +ngth, $Ornt, "$num_block\n" ); } } if ( $bytes_read > 0 ) { die "$AtrFile: got $bytes_read bytes, not $lbl_readsize ". "after rec# $recid (offs: $offset)\n"; } elsif ( $bytes_read < 0 ) { die "$AtrFile: read error ($bytes_read) after rec# $recid (off +s: $offset)\n"; } return ( $recid, $offset ); }
As for improving overall speed, I don't have anything to offer on that -- if it's your code that's causing the delay, I'm guessing the problem is somewhere other than the part you've shown us. Again, if you scatter some timing reports around, you'll get a better idea where to look.