bh_perl has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have developed a Perl script that reads each line of a file and normalizes it: if the length of a line/record is less than 130, it pads the remainder with "X" characters. I have tested my script on small files and it works, but not on larger files. How does this happen? Could somebody explain? Or do you want to see my script?

Replies are listed 'Best First'.
Re: Script not working on large file size
by davido (Cardinal) on Apr 22, 2004 at 03:26 UTC
    Without seeing your script, your description of the problem is inadequate for us to provide a diagnosis.

    There is the possibility that you're slurping the file, which doesn't scale well.
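    To illustrate the difference (a minimal sketch using an in-memory filehandle so it is self-contained; the data is placeholder text, not from the original script):

    ```perl
    use strict;
    use warnings;

    my $data = "one\ntwo\nthree\n";

    # Slurping: the whole input is held in memory at once.
    # On a ~500 MB file this can exhaust memory or push the box into swap.
    open(my $fh, '<', \$data) or die "open: $!";   # in-memory handle, for demonstration
    my @all = <$fh>;                               # entire input read into a list
    close($fh);

    # Line-by-line: only one record is in memory at a time,
    # so memory use stays flat regardless of file size.
    open($fh, '<', \$data) or die "open: $!";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;                                  # process $line here
    }
    close($fh);

    print "slurped ", scalar @all, " lines; streamed $count lines\n";
    # prints: slurped 3 lines; streamed 3 lines
    ```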


    Dave

      OK... this is my script:
      #!/usr/bin/perl -w
      use Fcntl;
      use Getopt::Std;
      use constant DIR => "/d/testing";
      use constant MAX => 11;

      my $prog = $0;
      $prog =~ s|.*/||;
      my %opts;
      getopts('o:', \%opts);
      my $cmd = $opts{o} || 1;

      my @date = localtime(time);
      my $dd = sprintf ("%02d", $date[3]);
      my $mm = sprintf ("%02d", $date[4] + 1);
      my $yy = sprintf ("20%02d", $date[5] % 100);
      my $yymmdd = "$yy$mm$dd";

      my @a;
      my $bnum;
      my $template = "A12 A4 A20 A20 A2 A2 A2 A2 A12 A4 A24 A24";

      unshift (@ARGV, "-") unless @ARGV;

      for my $mfile (@ARGV) {
          $file = $mfile;
          open (IN, "< $file") or die ("Can't open $file: $!\n");
          while (<IN>) {
              @normalDb = ();
              @errorDb  = ();
              s/\s//g;
              $dd = $_;
              my @a = unpack($template, $dd);
              $sdate = $a[8];
              $state = $a[4];
              $err   = $a[7];
              $bnum  = substr($a[3], skip_zero($a[3]));

              substr($dd, 37, 0) = '00' if ( $state eq '02' );
              substr($dd, 62, 3) = '01' if ( $sdate !~ /^0/ && $err !~ /^0/ );
              substr($dd, 62, 3) = '01' if ( length($dd) == 131 );

              # The last 2 characters of a line must be digits
              if ($dd !~ /\d\d$/ || $dd =~ /<u$/ ) {
                  $dd =~ s/..$/00/g;
              }

              # insert X for the rest
              if ( length($dd) < 130 ) {
                  $dd = insert0($dd);
              }
              else {
                  $dd = substr($dd, 0, 130);
              }

              if ( length($dd) == 130 ) {
                  push (@normalDb, $dd);
              }
              else {
                  push (@errorDb, $dd);
              }

              if ( @normalDb > 0 ) {
                  my $path = "$file\.ok";
                  open (NORMAL, "+>> $path") or die $!;
                  for (@normalDb) { print (NORMAL "$_\n") };
                  close (NORMAL);
              }
              if ( @errorDb > 0 ) {
                  my $path = "$file\.error";
                  open (ERROR, "+>> $path") or die $!;
                  for (@errorDb) { print (ERROR "$_\n") };
                  close (ERROR);
              }
          }
          close (IN);
      }

      sub skip_zero {
          my ($a) = @_;
          my $j = 0;
          foreach my $i (split //, $a) {
              $j++;
              return $j if ($i > 0);
          }
      }

      sub insert0 {
          my ($tmp) = @_;
          my @data;
          push(@data, $tmp);
          for (my $i = length($tmp); $i < 130; $i++) {
              push(@data, "X");
          }
          return sprintf ("%s", join('', @data));
      }

      die;
      For your information, the file size is about 507156496 bytes....
        ...it works, but not on larger files.

        Nothing stands out as particularly wrong with your code from a quick browse--except maybe that it is probably quite slow. However, you don't say in what way it is not working on large files. Does it produce error messages? Give the wrong output? Never finish?

        Are you simply not waiting long enough for it to complete?

        Without some hints as to how it's failing, working out why is hard.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
Re: Script not working on large file size
by matija (Priest) on Apr 22, 2004 at 05:49 UTC
    How about:
    while (<>) {
        # the "x" operator does the right thing for negative values
        print $_ . ("X" x (130 - length($_)));
    }
    The only circumstance I can imagine where Perl would fail on large files is if the files are more than 2GB in size and Perl wasn't compiled with large file support. But this loop should work even there (since it does not need to seek at all).
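    Whether a given perl was built with large file support can be checked from the standard Config module (uselargefiles and lseeksize are standard %Config keys):

    ```perl
    use strict;
    use warnings;
    use Config;

    # uselargefiles is 'define' when perl was built with support for
    # files larger than 2 GB; lseeksize is the seek offset width in bytes.
    print "uselargefiles: ", ($Config{uselargefiles} // 'undef'), "\n";
    print "lseeksize:     ", ($Config{lseeksize}     // 'undef'), " bytes\n";
    ```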