bh_perl has asked for the wisdom of the Perl Monks concerning the following question:


hi..

Below is my program to read a binary file that contains a File Header, Block Headers, Data and Block Trailers.
#!/usr/bin/perl -w
use Cwd;
use warnings;
use strict;
use Getopt::Long;

use constant BLOCKHDR  => 14;
use constant BLOCKTRL  => 77;
use constant FILEHDR   => 20;
use constant BLOCKLEN  => 1531;
use constant CDRLEN    => 160;
use constant MAXRECORD => 9;

my ($trace, $help, $infile, $swap);
my $indir  = getcwd;
my $outdir = getcwd;

GetOptions (
    "h|help"       => \$help,
    "filename|f=s" => \$infile,
    "swap|s"       => \$swap,
    "input|i=s"    => \$indir,
    "output|o=s"   => \$outdir,
    "trace|t"      => \$trace
) or usage();

my $template = "A2 A10 A2 A14 A2 A2 A2 A2 A2 A2 A4 A2 A4 A4 A4 A4 A7 A3 A1 A3 A3 A1 A1 A18 A61";
my @fldsize  = qw(A2 A10 A2 A14 A2 A2 A2 A2 A2 A2 A4 A2 A4 A4 A4 A4 A7 A3 A1 A3 A3 A1 A1 A18 A61);
#my $template = "A4 A20 A4 A28 A4 A4 A4 A4 A4 A4 A8 A4 A8 A8 A8 A8 A14 A6 A2 A6 A6 A2 A2 A36 A122";

my @fldname = (
    "Record Type", "Originating Number", "A Subscriber Category",
    "Reserved 1", "Starting Date-Year", "Starting Date-Month",
    "Starting Date-Day", "Starting Time-Hour", "Starting Time-Minute",
    "Starting Time-Second", "Duration Time-Minute", "Duration Time-Second",
    "Incoming Route Name", "Incoming Circuit Name", "Outgoing Route Name",
    "Outgoing Circuit Name", "Charge Meter", "MBI",
    "Long Duration Flag", "Bearer Service", "TeleService",
    "Subscriber Type", "IDD Indicator", "Termination Number",
    "Reserved 2"
);

sub usage {
    print ("USAGE: $0 -i <input_folder> -o <output_folder> -f <input_filename>\n\n");
    print ("NAME\n");
    print ("\t$0 - Swap anumber and bnumber record for Neax File Format\n\n");
    print ("DESCRIPTION\n");
    print ("\t-i or input : Assigned input folder name\n");
    print ("\t-o or output : Assigned output folder name\n");
    print ("\t-f or filename : Assigned input file name\n\n");
    print ("EXAMPLE\n");
    print ("Convert Neax file format from specific input dir and produce the output into specific output dir\n");
    print ("\t#$0 -i /acec/nv2am/data/input/NXGLM -o /acec/nv2am/data/output/INBILL/NX160 -f NXGLM.P523600_523699.dat\n\n");
    print ("Convert Neax file format from specific input dir and produce the output into current output dir\n");
    print ("\t#$0 -i /acec/nv2am/data/input/NXGLM -f NXGLM.P523600_523699.dat\n\n");
    print ("Convert Neax file format from current input dir and produce the output into specific output dir\n");
    print ("\t#$0 -o /acec/nv2am/data/input/NXGLM -f NXGLM.P523600_523699.dat\n\n");
    print ("Convert Neax file format from current input dir and produce the output into current output dir\n");
    print ("\t#$0 -f NXGLM.P523600_523699.dat\n");
    exit;
}

my $outfile = $infile;
my ($data, $tmp);

if ($infile) {
    open (OUTPUT, ">$outdir/$outfile");
    open (DATA, "$indir/$infile");
    binmode DATA;

    # Read File Header
    read (DATA, $data, FILEHDR);
    printf (OUTPUT "$data");
    if ($trace) {
        my $fhdr = unpack "H20", $data;
        printf ("FILE HEADER : %s\n", $fhdr);
    }

    # Read Block
    while (read (DATA, $data, BLOCKLEN)) {
        # Read Block Header
        read (DATA, $data, BLOCKHDR);
        printf (OUTPUT "$data");
        if ($trace) {
            my $blockhdr = unpack "H14", $data;
            printf ("\nBLOCK HEADER : >%s<\n", $blockhdr);
        }

        # Read 9 record for each block and record len is 190 byte for each record
        for (my $cdr=0; $cdr < MAXRECORD; $cdr++) {
            read (DATA, $data, CDRLEN);
            my @dd = unpack $template, $data;
            print ("\nRECORD $cdr\n");

            # Swap anumber with bnumber for each record
            if (defined $swap) {
                $tmp = $dd[1];
                $dd[1] = $dd[22];
                $dd[22] = $tmp;
            }

            for (my $i=0; $i < @dd; $i++) {
                printf (OUTPUT "%s", pack ($fldsize[$i], $dd[$i]));
                printf ("%-30s: >%s<\n", $fldname[$i],$dd[$i]) if ($trace);
            }
        }

        # Read Block Trailer
        read (DATA, $data, BLOCKTRL);
        printf (OUTPUT "$data");
        if ($trace) {
            my $blocktrl = unpack "H77", $data;
            printf ("BLOCK TRAILER : >%s<\n\n", $blocktrl);
        }
    }
    close(DATA);
    close(OUTPUT);
}

The problem is that my program displays a final block whose header and record fields are all empty, and this block does not exist in the binary input file. Below is a sample:
BLOCK HEADER : ><

RECORD 0
Record Type : ><
Originating Number : ><
A Subscriber Category : ><
Reserved 1 : ><
Starting Date-Year : ><
Starting Date-Month : ><
Starting Date-Day : ><
Starting Time-Hour : ><
Starting Time-Minute : ><
Starting Time-Second : ><
Duration Time-Minute : ><
Duration Time-Second : ><
Incoming Route Name : ><
Incoming Circuit Name : ><
Outgoing Route Name : ><
Outgoing Circuit Name : ><
Charge Meter : ><
MBI : ><
Long Duration Flag : ><
Bearer Service : ><
TeleService : ><
Subscriber Type : ><
IDD Indicator : ><
Termination Number : ><
Reserved 2 : ><

Replies are listed 'Best First'.
Re: Reading binary file is not accurate ?
by BrowserUk (Patriarch) on Apr 27, 2010 at 18:36 UTC

    BLOCKLEN (1531) = BLOCKHDR(14) + MAXRECORD(9) * CDRLEN(160) + BLOCKTRL(77), but ...

    # Read Block
    while (read (DATA, $data, BLOCKLEN)) {
        # Read Block Header
        read (DATA, $data, BLOCKHDR);

    You read the whole block into $data and then attempt to read the individual parts, promptly overwriting (and therefore discarding) $data.

    You should either:

    1. Read the whole block and then extract the header/CDRs/trailer from what you read ($data); or
    2. Read the header/CDRs/trailer individually.

    Don't do both!
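
    To see why doing both produces the empty block in the trace, here is a small sketch on made-up data (three filler blocks in an in-memory file, no file header; none of this is the real NEAX input). The loop condition swallows one block, the body then parses the next one, so an odd number of blocks leaves nothing for the body on the final pass:

```perl
use strict;
use warnings;

use constant BLOCKHDR  => 14;
use constant BLOCKTRL  => 77;
use constant CDRLEN    => 160;
use constant MAXRECORD => 9;
use constant BLOCKLEN  => BLOCKHDR + MAXRECORD * CDRLEN + BLOCKTRL;    # 1531

# Hypothetical test data: three identical blocks of filler bytes.
my $block = ( 'H' x BLOCKHDR ) . ( 'R' x ( MAXRECORD * CDRLEN ) ) . ( 'T' x BLOCKTRL );
my $input = $block x 3;

open my $fh, '<', \$input or die "Cannot open in-memory file: $!\n";
binmode $fh;

my @headers;
while ( read( $fh, my $whole, BLOCKLEN ) ) {    # consumes a complete block
    # The same pattern as in the original loop: read the parts again.
    read( $fh, my $header, BLOCKHDR );          # comes from the NEXT block, if any
    read( $fh, my $rest, MAXRECORD * CDRLEN + BLOCKTRL );
    push @headers, $header;
}
close $fh;

# Report how many bytes each pass actually found for the header.
printf "pass %d: header has %d bytes\n", $_ + 1, length( $headers[$_] // '' )
    for 0 .. $#headers;
```

    With three blocks the loop runs only twice: the first pass reports block 2's header (block 1 is silently skipped), and the second pass finds an empty header because block 3 was already consumed by the loop condition.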


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Hi,
      Could you give an example?

        The first method I described--read the entire block, and then extract the header, records and trailer from it--would look something like this:

        while( read DATA, $data, BLOCKLEN ) {
            ## Extract the header from the front of the block
            my $header = substr $data, 0, BLOCKHDR, '';
            ## do stuff with $header

            for my $cdr ( 1 .. MAXRECORD ) {
                ## Extract each record in turn from $data
                my $record = substr $data, 0, CDRLEN, '';
                ## do stuff with $record
            }

            my $trailer = $data; ## Anything left in $data should be your trailer.
            ## do something with it.
        }
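
        As a variation on the first method (not what the reply itself uses), a single unpack call with a repeated group could split the block in one step. This is only a sketch: split_block and $block_template are hypothetical names, and the block contents are made-up filler bytes.

```perl
use strict;
use warnings;

use constant BLOCKHDR  => 14;
use constant BLOCKTRL  => 77;
use constant CDRLEN    => 160;
use constant MAXRECORD => 9;

# 'a' returns the bytes unchanged (no stripping of trailing spaces or NULs).
# The resulting template is "a14 (a160)9 a77".
my $block_template = sprintf 'a%d (a%d)%d a%d', BLOCKHDR, CDRLEN, MAXRECORD, BLOCKTRL;

# Hypothetical helper: split one complete block into header, records, trailer.
sub split_block {
    my ($block) = @_;
    my ( $header, @rest ) = unpack $block_template, $block;
    my $trailer = pop @rest;    # the last field is the trailer
    return ( $header, \@rest, $trailer );
}

# Made-up block: 'H' header, records filled with 'A' .. 'I', 'T' trailer.
my $block = ( 'H' x BLOCKHDR )
    . join( '', map { chr( 64 + $_ ) x CDRLEN } 1 .. MAXRECORD )
    . ( 'T' x BLOCKTRL );

my ( $header, $records, $trailer ) = split_block($block);
printf "header %d bytes, %d records, trailer %d bytes\n",
    length($header), scalar(@$records), length($trailer);
```

        Each element of @$records can then be unpacked with the 25-field record template from the original program.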

        The second method I described might be done something like this:

        until( eof DATA ) {
            read DATA, my $header, BLOCKHDR or die;
            ## Do stuff with $header;

            for my $cdr ( 1 .. MAXRECORD ) {
                read DATA, my $record, CDRLEN or die;
                ## Do stuff with $record
            }

            read DATA, my $trailer, BLOCKTRL or die;
            ## Do stuff with $trailer
        }
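
        One caveat with "read ... or die": read returns the number of bytes it actually got, so a truncated record of, say, 86 bytes still counts as true. A possible way to tighten the second method is sketched below; read_exact and read_block are hypothetical helpers, and the input is made-up filler data with a deliberately truncated second block.

```perl
use strict;
use warnings;

use constant BLOCKHDR  => 14;
use constant BLOCKTRL  => 77;
use constant CDRLEN    => 160;
use constant MAXRECORD => 9;

# Hypothetical helper: return exactly $len bytes from $fh, or die.
sub read_exact {
    my ( $fh, $len, $what ) = @_;
    my $got = read( $fh, my $buf, $len );
    die "Read error on $what: $!\n" unless defined $got;
    die "Short read on $what: wanted $len bytes, got $got\n" if $got != $len;
    return $buf;
}

# Read one block part by part: header, MAXRECORD records, trailer.
sub read_block {
    my ($fh) = @_;
    my $header  = read_exact( $fh, BLOCKHDR, 'block header' );
    my @records = map { read_exact( $fh, CDRLEN, "record $_" ) } 1 .. MAXRECORD;
    my $trailer = read_exact( $fh, BLOCKTRL, 'block trailer' );
    return ( $header, \@records, $trailer );
}

# Made-up input: one complete block followed by 100 stray bytes.
my $block = ( 'H' x BLOCKHDR ) . ( 'R' x ( MAXRECORD * CDRLEN ) ) . ( 'T' x BLOCKTRL );
my $input = $block . substr( $block, 0, 100 );

open my $fh, '<', \$input or die "Cannot open in-memory file: $!\n";
binmode $fh;

my ( $header, $records, $trailer ) = read_block($fh);
printf "block 1: %d records\n", scalar @$records;

# The second, truncated block is rejected instead of yielding partial records.
my $ok    = eval { read_block($fh); 1 };
my $error = $ok ? '' : $@;
print "block 2 rejected: $error" unless $ok;
close $fh;
```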

Re: Reading binary file is not accurate ?
by graff (Chancellor) on Apr 27, 2010 at 18:02 UTC
    You should check whether the "open" statement succeeds on the input file:
    open (DATA, "$indir/$infile") or die "Open failed on $indir/$infile: $!\n";
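
    The same check applies to the output file. A minimal sketch, assuming three-argument open with lexical filehandles is acceptable here; open_binary is a hypothetical helper and the file names are invented for the demo, which runs in a temporary directory:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: open $dir/$name in binary mode, or die with path and reason.
# $mode is '<' for reading or '>' for writing.
sub open_binary {
    my ( $mode, $dir, $name ) = @_;
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, $mode, $path or die "Open failed on $path: $!\n";
    binmode $fh;
    return $fh;
}

# Demo in a throwaway directory that is removed when the program exits.
my $dir = tempdir( CLEANUP => 1 );

my $out = open_binary( '>', $dir, 'sample.dat' );
print {$out} "\x00\x01\x02";
close $out or die "Close failed: $!\n";

my $in  = open_binary( '<', $dir, 'sample.dat' );
my $got = read( $in, my $bytes, 3 );
close $in;
print "read back $got bytes\n";

# A missing input file now stops the program with a clear message.
my $missing_ok = eval { open_binary( '<', $dir, 'no-such-file.dat' ); 1 };
my $open_error = $missing_ok ? '' : $@;
print "as expected: $open_error" unless $missing_ok;
```

    Lexical handles also avoid reusing the name DATA, which Perl reserves for the handle that reads a script's __DATA__ section.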