Real Perl has asked for the wisdom of the Perl Monks concerning the following question:

Dearest,
I was having a ball with the Win32::DirSize module (that I finally was able to install!) until... My program takes a list of directories, gets the size of each one --displaying each name and size individually in an HList (Tk stuff)-- and adds the sizes together, so I end up with a variable holding the total number of bytes. I would like to display that total in a nice way --like with best_convert() from Win32::DirSize. The problem is that there is no single directory to point to, and best_convert() requires that. The syntax is:
my $Size = best_convert(
    my $SizeUnit,
    $DirInfo->{HighSize},
    $DirInfo->{LowSize},
);
print "Dir size = $Size $SizeUnit \n";

Also, if the total number of bytes is greater than 3.999GB, Perl cannot handle it (see explanation below).
I have experimented with making my own conversion which works great until the size is more than 3.999GB. The reason is best told by CPAN: "Since drive and directory sizes on Win32 systems can easily reach the multi-terabyte range and beyond, and the result perl can store in a single 32-bit integer is 3.999 GB, it's not possible to return an accurate result in a single variable. So, the Win32 API and this module return the result in two separate values representing the least and most significant 32 bits. This module also provides the result as a string value, suitable for printing and use with Math::BigInt. Be aware that doing any math on the string value will convert it to a floating point value internally and you will lose precision."
That is a great explanation, but where are those variables??? What is the name of the one holding the least significant 32 bits, and of the one holding the most significant 32 bits? Does anyone know? Also, I would be most grateful for an even easier solution: a module that could handle the conversion if I passed it the two mysterious variables (the least and the most significant 32-bit values). In my dreams, it would spit out the number and the unit and I could print it!

Here is my code (it generates a table-like HList and fills it with the directories my user gives me and the sizes calculated by the Win32::DirSize module):
sub make_table(){
    $table=$page3->Scrolled('HList',-columns=> 2,-header=> 1,-width=>31,-height=>9,-font=>"Times 12",-scrollbars=>'osoe')
        ->place(-x=> 280, -y=> 136);
    $table ->headerCreate(0, -text=>" Directory ");
    $table ->headerCreate(1, -text=>" Size ");
    open(BATCHIN, "<batch.txt") or die;#open batch
    $source =~s/\//\\/g;
    while(<BATCHIN>){
        chomp;
        @arraybatch = split ' '; #put each line in an array to use when I invoke the size calculation method
        foreach (@arraybatch){
            my $cleansource = $source;
            $cleansource .= "\\";
            $cleansource .= $_; #append the \\ and the user directory
            $sourceResult = dir_size($cleansource, $sourceDirInfo,);#get the size of the directory
            $sbyte += $sourceDirInfo->{DirSize}; #add the bytes of all of the directories
            $sourceSize = best_convert($sourceSizeUnit, $sourceDirInfo->{HighSize}, $sourceDirInfo->{LowSize},);#converts
            my $s = $sourceSize;#append the size
            $s .= $sourceSizeUnit;#appends the unit to the size
            $table->add($_);
            $table->itemCreate($_, 0, -text => $_);
            $table->itemCreate($_, 1, -text => $s);
        }#foreach
    }#while
}#make_table
I have been experimenting with printing the total in the appropriate unit in a separate file, and here is the code:
#!/usr/bin/perl -w
use strict;
use Win32::DirSize;
use integer;

my $sourcesize="Total: ";
my $sbyte=0;
my $test = "C:/";
$test=~s/\//\\/g;
print $test;
my $Result = dir_size(
    $test,
    my $DirInfo, # this stores the directory information
);
$sbyte = $DirInfo->{DirSize};
print "Dir size = $sbyte bytes\n";
if ($sbyte < 1024){
    $sourcesize .= "$sbyte B";
    print $sourcesize;
} elsif (1024 <= $sbyte && $sbyte < 10240){
    $sbyte=$sbyte/1024;
    $sourcesize .= "$sbyte KB";
    print $sourcesize;
}elsif (10240< $sbyte && $sbyte < 1073741824){
    $sbyte=$sbyte/1048576;
    $sourcesize .= "$sbyte MB";
    print $sourcesize;
}else{
    print "More than 3.99GB";
}#else

The else statement is never reached because of the two independent variables! (I think?) And when the directory is bigger than 3.999GB, the first if statement matches and it prints all 10 or more digits with B!
Thank you in advance for your insights

Claire

Replies are listed 'Best First'.
Re: Win32::DirSize and GB amount of data
by BrowserUk (Patriarch) on Aug 11, 2005 at 00:16 UTC

    Why are you using use integer? I think that is the source of most of your problems.

    Without that, Perl will transparently convert your integers to reals if they get bigger than 2**32. Using reals, Perl can happily handle integer values up to 2**53 with no loss of accuracy. This equates to 9,007,199,254,740,992, which is 8 petabytes or ~8 million gigabytes.
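The arithmetic behind those figures can be checked with a few lines of plain Perl. This is only a sketch: the variable names are illustrative, it does not use Win32::DirSize at all, and it assumes the binary units (2**30 bytes per GB, 2**50 per PB) that the rest of this thread uses.

```perl
use strict;
use warnings;

# Largest power of two below which a double holds every integer exactly.
my $limit = 2**53;

# The same quantity expressed in binary gigabytes and petabytes.
my $limit_gb = $limit / 2**30;
my $limit_pb = $limit / 2**50;

printf "%.0f bytes\n", $limit;
printf "%.0f GB\n",    $limit_gb;
printf "%.0f PB\n",    $limit_pb;
```

Note that there is no `use integer` here; the point of the sketch is that ordinary Perl numbers already cover this range.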

    So unless you are expecting to be using drives greater than that, forget use integer and all the stuff about "losing accuracy" and just let Perl take care of it.

    Some code to demonstrate:

    #! perl -slw
    use strict;
    use Win32::DirSize;
    use Data::Dumper;

    dir_size( 'p:\\', my $info );
    print Dumper $info;

    ## As a string
    print $info->{DirSize};

    ## Same field as an integer
    print 0+$info->{DirSize};

    ## Same value by combining the high and low parts into a float
    printf "%.f\n", ( $info->{HighSize} * 2**32 ) + $info->{LowSize};

    my $dirsize = $info->{DirSize};

    if(    $dirsize < 2**10 ) { printf "%d bytes\n",$dirsize }
    elsif( $dirsize < 2**20 ) { printf "%.2f KB\n", $dirsize / 2**10 }
    elsif( $dirsize < 2**30 ) { printf "%.2f MB\n", $dirsize / 2**20 }
    elsif( $dirsize < 2**40 ) { printf "%.2f GB\n", $dirsize / 2**30 }
    elsif( $dirsize < 2**50 ) { printf "%.2f TB\n", $dirsize / 2**40 }
    elsif( $dirsize < 2**60 ) { printf "%.2f PB\n", $dirsize / 2**50 }
    else{
        print "This number is probably wrong: $dirsize";
    }

    __END__
    [ 1:15:03.82] P:\test>DirSize.pl
    $VAR1 = {
              'DirCount' => 3866,
              'LowSizeOnDisk' => 140623872,
              'HighSizeOnDisk' => 4,
              'FileCount' => 67685,
              'DirSizeOnDisk' => '17320493056',
              'DirSize' => '46480798709',
              'LowSize' => 3531125749,
              'Errors' => [],
              'HighSize' => 10
            };

    46480798709
    46480798709
    46480798709
    43.29 GB
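For the original question (one total across several directories), the same idea could be wrapped in a small helper. What follows is a minimal sketch under stated assumptions: `format_bytes` is a hypothetical function, not part of Win32::DirSize, and the sizes in `@dir_sizes` are stand-in values where the real script would add up `$info->{DirSize}` after each `dir_size()` call.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Win32::DirSize): scale a byte count
# down by 1024 until it is below 1024 or the largest unit is reached.
sub format_bytes {
    my ($bytes) = @_;
    my @units = qw(B KB MB GB TB PB);
    my $i = 0;
    while ( $bytes >= 1024 && $i < $#units ) {
        $bytes /= 1024;
        $i++;
    }
    return $i == 0
        ? sprintf( "%d %s",   $bytes, $units[$i] )
        : sprintf( "%.2f %s", $bytes, $units[$i] );
}

# Stand-in values; in the real script each one would be the DirSize
# field filled in by dir_size() for one directory.
my @dir_sizes = ( 46480798709, 17320493056, 512 );

# Plain addition, with no "use integer" in scope.
my $total = 0;
$total += $_ for @dir_sizes;

print "Total: ", format_bytes($total), "\n";
```

Because the running total is an ordinary Perl number, the formatted result does not depend on any single directory, which is what best_convert() could not offer here.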
