FireyIce01 has asked for the wisdom of the Perl Monks concerning the following question:

Okay, found an issue.... we have several large files that we'll be looking at... 30+ gigs... Oh, guess I should show you what I'm doing... Basically I'm doing a search of several specific folders for files based on corp number, and I've got the script working (thanks to a lot of help from this site ;o) but I noticed that it doesn't handle large files well... here's my whole script, with notes below.

#!/usr/bin/perl
use strict;
use File::stat;
use Date::Manip;

# This is a program that is intended to allow FEP operators
# a tool that will facilitate in monitoring incoming transmissions
# and preprocessor logs.

#------------------- Define Subroutines ------------------#
#
#
# Main is a subroutine simply written for flow control
sub Main {
    my $corp = $_[0];
    print "Input:\n";
    farmInput($corp, 'input');
    print "Save:\n";
    farmInput($corp, 'input/save');
    print "Failed:\n";
    farmInput($corp, 'input/failed');
    print "Log:\n";
    farmInput($corp, 'log');
} # end Main

# This subroutine takes 2 input variables, the corp (or any text you want to find)
# and the path
sub farmInput {
    my($corp, $finalPath) = ($_[0], $_[1]);
    my $path = "/usr/local/farm/*/*/$finalPath/*$corp*";
    my @farms = glob($path);
    foreach my $farm (@farms) {
        my $s = stat($farm);
        printf "%s\nAge[%s] size[%d]\n", $farm, ParseDateString("epoch".$s->mtime()), $s->size();
        # the above line does the same thing as the next 4 commented lines
        # printf "%s: Age [%s], size[%s]\n",
        #     $farm,
        #     DateCalc( "Jan 1, 1970 00:00:00 GMT",$s->mtime() ),
        #     $s->size();
    } # end of foreach statement
} # end farmInput

#------------------- End Subroutines ---------------------#

print "What corp are you looking for? " ;
chomp(my $whatWeWant = <STDIN>);
Main($whatWeWant);

The problem is encountered here. Using my Perl script I get:

/usr/local/farm/vstrm/vsafp/input/IVS86095.01172005.C01.ITDVS.STMTS.100105.DAT01
Age[2005011723:54:53] size[-1]

when ls -ltr gives me:

-rw-rw-r-- 1 cdusr001 prod 29116399616 Jan 17 23:55 IVS86095.01172005.C01.ITDVS.STMTS.100105.DAT01

Notice, the file is also still coming in... I don't know if that has anything to do with the issue, but this is usually a 32 gig file... soo...

How would I go about writing this so that it shows the filesize properly?

Readmore tags added by davido per consideration vote of 3/35/0.

Replies are listed 'Best First'.
Re: Large Files and glob/stat
by BrowserUk (Patriarch) on Jan 18, 2005 at 09:12 UTC

    You need to check whether your perl is built with large file support. Run perl -V and look for a line that reads something like the following; USE_LARGE_FILES is the option that matters:

    Characteristics of this binary (from libperl):
      Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES
                                                      ^^^^^^^^^^^^^^^
                            PERL_IMPLICIT_CONTEXT PERL_IMPLICIT_SYS
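
The same check can be made from inside a script. This is only a minimal sketch, assuming the standard Config module; the helper name has_large_file_support is made up for illustration, and what it reports depends entirely on how your perl was built.

```perl
use strict;
use warnings;
use Config;    # core module exposing perl's build-time configuration

# Hypothetical helper: returns 1 if this perl was compiled with
# large file support (uselargefiles=define), 0 otherwise.
sub has_large_file_support {
    return ( $Config{uselargefiles} // '' ) eq 'define' ? 1 : 0;
}

printf "large file support: %s\n", has_large_file_support() ? 'yes' : 'no';

# lseeksize is the width of a file offset in bytes; 8 means 64-bit offsets.
printf "file offset size:   %s bytes\n", $Config{lseeksize} // 'unknown';
```

If the first line says "no", stat() on a file over 2 GB may well fail or report a wrong size regardless of how the number is printed.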

    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
Re: Large Files and glob/stat
by Zaxo (Archbishop) on Jan 18, 2005 at 08:59 UTC

    Your file size is a somewhat larger integer than the usual 32-bit integer can hold. Perl can actually go a few bits larger using the wider float type, but I don't think that's quite enough.

    The problem might extend to the C library's stat() call.

    Try running your program on a perl built for 64-bit integers (which can be done on any platform). If it's still broken, look to the C libs.

    If the size reporting is not that critical, you can just say something like: printf 'Size: %s', $size < 0 ? '> 2 GB' : $size;

    After Compline,
    Zaxo

Re: Large Files and glob/stat
by chb (Deacon) on Jan 18, 2005 at 08:40 UTC
    You are trying to use the thing returned by stat as an object, but perldoc -f stat says it is merely a list containing the file size at index seven. Try my $s = (stat($farm))[7]. Or, better yet, check the return value of stat first...
    Or try the -s operator (man perlfunc).
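
For reference, the list form of stat and the -s operator look roughly like this. It is a sketch only: file_size is a made-up helper name, and CORE::stat is spelled out so the built-in is called even in a script that has loaded File::stat (which replaces stat with an object-returning version).

```perl
use strict;
use warnings;

# Hypothetical helper: size of $path in bytes, or undef if stat fails
# (for example, when the file does not exist).
sub file_size {
    my ($path) = @_;
    my @st = CORE::stat($path);    # built-in stat: returns a 13-element list
    return undef unless @st;       # empty list means stat failed
    return $st[7];                 # element 7 is the size in bytes
}

my $file = $0;                     # use this script itself as a sample file
my $size = file_size($file);
if ( defined $size ) {
    print "$file: $size bytes (stat)\n";
}
else {
    warn "stat failed for $file: $!\n";
}

# -s returns the size for a non-empty file and a false value otherwise.
my $s = -s $file;
print "$file: $s bytes (-s)\n" if $s;
```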
      Well, with files under 2 gigs it gives me an accurate byte count... if you notice, I have use File::stat; which lets me reference the fields by name, and printf "%s\nAge[%s] size[%d]\n", $farm, ParseDateString("epoch".$s->mtime()), $s->size(); is referencing that element of the list by name... the problem only seems to occur on files larger than 2 gigs...

      Scratch that. I changed

          printf "%s\nAge[%s] size[%d]\n", $farm, ParseDateString("epoch".$s->mtime()), $s->size();

      to

          printf "%s\nAge[%s] size[%s]\n", $farm, ParseDateString("epoch".$s->mtime()), $s->size();

      and it started working... printf was looking for an integer for the file size, and when the number it got was too big it just turned it into a -1... all fixed. I just print it as a string, since I'm not actually doing any math on it.
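
The difference between the two format specifiers can be isolated in a few lines. This is a rough sketch with a made-up helper name, format_size, using the byte count from the ls listing in the question. %d converts its argument to perl's native integer type, which on a perl built with 32-bit integers cannot hold this value; %s just prints the number as text.

```perl
use strict;
use warnings;

# Hypothetical helper: render a byte count as text instead of forcing it
# through an integer conversion.
sub format_size {
    my ($bytes) = @_;
    return sprintf '%s', $bytes;
}

my $size = 29116399616;            # size reported by ls -ltr in the question

printf "size[%s]\n", format_size($size);

# For comparison: what %d produces here depends on the integer width
# your perl was built with.
printf "size[%d]\n", $size;
```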

        Oops, I should read the code more carefully...