Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm new to Perl and running into difficulty with a Perl routine I've written. Basically, I have an EBCDIC file without carriage returns that is quite large. The routine I've written functions fine until I throw my "while" statement in at which point I get an "out of memory" error. Help pls! Thanks in advance..Chris
#!/usr/bin/perl
use strict;
use warnings;
use lib '/home/q37j6m4/lib/';
# proceed as usual
use Convert::IBM390 qw(:all);
set_codepage('CP00037');

print "script to import pscmt.txt\n";

my $fileIN  = "/sas/rbcpcins/ha/rawdata/pscmt.txt";
my $fileOUT = "/home/q37j6m4/test_out.txt";
my $fileLOG = "/home/q37j6m4/chris_log.txt";

my $recln =98; #length of file being imported

open(perlIN,"<",$fileIN) or die "Can't open input.txt: $!";
open(perlOUT, ">",$fileOUT) or die "Can't open output.txt: $!";
open(perlLOG, ">>",$fileLOG) or die "Can't open my.log: $!";

while(<perlIN>)
{
    print $index,"\n";

    # pre-define fields
    my $cltno2 ="";
    my $cmtseq ="";
    my $npseqn ="";
    my $comm ="";
    my $policy ="";
    my $pdcode ="";
    my $userid ="";
    my $reccdt ="";
    my $recctm ="";
    my $cmgrcd ="";
    my $actcod ="";
    my $upid ="";

    # read fields
    read(perlIN,$cltno2,5);
    read(perlIN,$cmtseq,5);
    read(perlIN,$npseqn,3);
    read(perlIN,$comm,45);
    read(perlIN,$policy,8);
    read(perlIN,$pdcode,3);
    read(perlIN,$userid,10);
    read(perlIN,$reccdt,5);
    read(perlIN,$recctm,6);
    read(perlIN,$cmgrcd,1);
    read(perlIN,$actcod,3);
    read(perlIN,$upid,4);

    # convert fields
    my $cltno2_c = unpackeb('p5',$cltno2);
    my $cmtseq_c = unpackeb('p5',$cmtseq);
    my $npseqn_c = unpackeb('p3',$npseqn);
    my $comm_c = eb2asc($comm);
    my $policy_c = eb2asc($policy);
    my $pdcode_c = eb2asc($pdcode);
    my $userid_c = eb2asc($userid);
    my $reccdt_c = unpackeb('p5',$reccdt);
    my $recctm_c = eb2asc($recctm);
    my $cmgrcd_c = eb2asc($cmgrcd);
    my $actcod_c = eb2asc($actcod);
    my $upid_c = unpackeb('p4',$upid);

    # write fields
    print perlOUT sprintf ("%12.0f", $cltno2_c);
    print perlOUT sprintf ("%12.0f", $cmtseq_c);
    print perlOUT sprintf ("%12.0f", $npseqn_c);
    print perlOUT $comm_c;
    print perlOUT $policy_c;
    print perlOUT $pdcode_c;
    print perlOUT $userid_c;
    print perlOUT sprintf ("%12.0f", $reccdt_c);
    print perlOUT sprintf ("%12.0f", $recctm_c);
    print perlOUT $cmgrcd_c;
    print perlOUT $actcod_c;
    print perlOUT sprintf ("%12.0f", $upid_c);
    print perlOUT "\r\n";
} # end while loop

close perlIN;
close perlOUT;
close perlLOG;

Replies are listed 'Best First'.
Re: Looping through a binary file
by runrig (Abbot) on Sep 06, 2013 at 17:04 UTC
    Your 'while' reads in a line of the file, which, being a binary file, is not what you want to do. You can redefine what a 'line' is by setting $/ (see perlvar). If you set it to a reference to an integer (e.g. $/ = \42;), it will read that many bytes. You can set it to the length of your fixed length record. But then you will have to adjust how you set your variables in the following lines (by using substr or unpack or Parse::FixedLength or something) to 'read' from $_ instead of reading from the filehandle again.
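As a rough sketch of the `$/` approach described above: the field widths are copied from the `read` calls in the question, while `each_record` and its callback argument are hypothetical names introduced only for illustration. The EBCDIC conversion step (`eb2asc` / `unpackeb`) is left out so the example stays self-contained.

```perl
use strict;
use warnings;

# Field widths taken from the read() calls in the question; they sum to 98.
my @widths   = (5, 5, 3, 45, 8, 3, 10, 5, 6, 1, 3, 4);
my $template = join ' ', map { "a$_" } @widths;   # "a5 a5 a3 a45 ..."
my $recln    = 98;

# Hypothetical helper: call $handler once per fixed-length record,
# passing the raw (still unconverted) fields as a list.
sub each_record {
    my ($fh, $handler) = @_;
    local $/ = \$recln;                  # readline now returns $recln bytes at a time
    while (my $rec = <$fh>) {
        last if length($rec) < $recln;   # skip a truncated trailing record
        $handler->(unpack($template, $rec));
    }
    return;
}
```

With the handle opened in binary mode (for example with the `:raw` layer), each callback invocation receives twelve byte strings from `$_`-style record data rather than from further reads on the filehandle, and memory use stays at one record at a time.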
Re: Looping through a binary file
by Laurent_R (Canon) on Sep 06, 2013 at 17:46 UTC

    Or rather, use the read function, which reads the number of bytes you specify. For example:

    open my $fh, "<", $infile or die "cannot open $infile: $!";
    my $out;
    read $fh, $out, 100; # reads up to 100 bytes from the file

    You can also use the seek function to move down the file.
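A minimal sketch of combining `seek` and `read` to jump straight to a given record, assuming fixed-length records with no separators. `fetch_record` is a hypothetical helper name, not something from the thread.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: fetch record number $n (counting from 0) from a
# file of fixed-length records. Returns undef if the file ends before a
# full record could be read.
sub fetch_record {
    my ($fh, $n, $recln) = @_;
    seek($fh, $n * $recln, SEEK_SET) or die "seek failed: $!";
    my $got = read($fh, my $buf, $recln);
    die "read failed: $!" unless defined $got;
    return $got == $recln ? $buf : undef;
}
```

For the file in the question this would be called as `fetch_record($fh, $n, 98)` on a handle opened in binary mode.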

Re: Looping through a binary file
by marinersk (Priest) on Sep 06, 2013 at 23:06 UTC
    Suspect your problem is that while(<perlIN>) is causing it to read the whole binary file, all at once, right there. Boom -- out of memory.

    Instead, as noted by others, read only what you need -- one record at a time.

    A common approach is to use a flag variable to indicate when you reach end of file; control the loop based on that.

    Two ways to do this:

    1. Read one record at the top of the loop (use read and your $recln value); use substr to extract the pieces, or;
    2. Don't read anything at the top of the loop, but instead read each variable as you currently do; modify each read to check for end of file (sample below).

    my $TRUE = 1;
    my $FALSE = 0;
    [...]
    my $EofFlag = $FALSE;
    while(!$EofFlag)
    {
        [...]
        my $cmtseq_cnt = read(perlIN,$cmtseq,5);
        if ($cmtseq_cnt < 5)
        {
            $EofFlag = $TRUE; # Redundant, given what we do next, but here for example
            last;
        }
    }
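The first option (one `read` per record, then `substr`) might look roughly like the following. The offsets are derived from the field widths in the question and only the first four fields are shown; `process_file` and the handler callback are hypothetical names, and the EBCDIC conversion is omitted.

```perl
use strict;
use warnings;

# Hypothetical helper: read one whole record per iteration, slice it
# with substr, and hand the raw fields to $handler. Returns the number
# of complete records processed.
sub process_file {
    my ($fh, $recln, $handler) = @_;
    my $count = 0;
    while (1) {
        my $got = read($fh, my $rec, $recln);
        die "read failed: $!" unless defined $got;
        last if $got == 0;                          # clean end of file
        die "short record ($got bytes)" if $got < $recln;
        $handler->(
            cltno2 => substr($rec,  0,  5),
            cmtseq => substr($rec,  5,  5),
            npseqn => substr($rec, 10,  3),
            comm   => substr($rec, 13, 45),
        );
        $count++;
    }
    return $count;
}
```

Checking the return value of `read` takes the place of a separate end-of-file flag: 0 means end of file, and anything between 0 and the record length means the file was truncated.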
Re: Looping through a binary file
by daxim (Curate) on Sep 06, 2013 at 17:07 UTC
    There's a stray $index variable on line 25, but the program basically works for me. To reproduce your error, also provide the input data.
Re: Looping through a binary file
by Marshall (Canon) on Sep 07, 2013 at 10:26 UTC
    I don't work with EBCDIC now, but if you want to, I would suggest that you examine and read: http://perldoc.perl.org/perlebcdic.html very closely.
Re: Looping through a binary file
by aitap (Curate) on Sep 08, 2013 at 19:03 UTC

    while (<filehandle>) is implicitly while (defined ($_ = readline filehandle)) which reads a very long line. You probably wanted to check whether there is still data to read, which is done by the eof function.

    Solution: use something like this: while (!eof(perlIN)).
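A sketch of an `eof`-controlled loop that keeps the field-by-field `read` calls from the original script. The field names and widths come from the question; `read_all` and the handler callback are hypothetical, and the conversion and output steps are left to the handler.

```perl
use strict;
use warnings;

# Field names and widths taken from the read() calls in the question.
my @layout = (
    [ cltno2 => 5 ],  [ cmtseq => 5 ], [ npseqn => 3 ],  [ comm   => 45 ],
    [ policy => 8 ],  [ pdcode => 3 ], [ userid => 10 ], [ reccdt => 5 ],
    [ recctm => 6 ],  [ cmgrcd => 1 ], [ actcod => 3 ],  [ upid   => 4 ],
);

# Hypothetical helper: loop until end of file, reading one field at a
# time, and pass each record to $handler as a hash reference.
sub read_all {
    my ($fh, $handler) = @_;
    until (eof($fh)) {
        my %rec;
        for my $field (@layout) {
            my ($name, $len) = @$field;
            my $got = read($fh, $rec{$name}, $len);
            die "truncated record at field $name"
                unless defined $got && $got == $len;
        }
        $handler->(\%rec);
    }
    return;
}
```

Unlike `while (<perlIN>)`, the loop condition here consumes no data, so every byte in the file is seen by the `read` calls inside the loop.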