in reply to Working out frequency statistics with perl

You aren't showing the whole script, so I wonder... do you have these lines anywhere in your code (near the top)?
use strict; use warnings;
Apart from that, I wonder if your "readmap" command ever outputs any lines with initial whitespace. If so, your split statement should be:
@fields = split " ";
because that will differ from /\s+/ -- try this snippet to see the difference:
$_ = " begins with whitespace";
@a = split " ";
@b = split /\s+/;
print "quoted-space split returns ", scalar @a, " elements\n";
print "\\s+ split returns ", scalar @b, ", first one has length ", length($b[0]), "\n";
update: To clarify the issue, using "\s+" for splitting will get you into real trouble if the "readmap" output varies in terms of presence/absence of whitespace at the beginning of each line. If the output is consistent in this regard, then using "\s+" is probably not a problem -- you just need to make sure you've counted the field indexes correctly, so that $fields[2] etc really point at what you want them to point at.
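To make the index shift concrete, here is a minimal sketch. The input line is made up for illustration (the real "readmap" output is not shown here), and the variable names are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical line with leading whitespace; NOT real readmap output.
my $line = "  alpha beta gamma";

# split ' ' skips leading whitespace, so the first word is element 0.
my @awk_style = split ' ', $line;

# split /\s+/ produces an empty leading field, so every word moves up by one index.
my @regex_style = split /\s+/, $line;

print "split ' '    : field 2 is '$awk_style[2]'\n";
print "split /\\s+/  : field 2 is '$regex_style[2]'\n";
```

With this input, index 2 should hold "gamma" in the first case and "beta" in the second, which is exactly the kind of off-by-one that would make $fields[2] point at the wrong column.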

Re^2: Working out frequency statistics with perl
by wishartz (Beadle) on Jul 16, 2008 at 10:43 UTC
    Hello again Monks, thanks for everybody's feedback. To answer the last reply I got: I am using warnings and strict, and I am not getting any errors. The output is always consistent as well. Here is a sample of the exact output from the readmap command when it matches the pattern StageTime.

    I don't understand the subroutine that was posted earlier. I don't understand what I am supposed to pass to items? Am I supposed to pass the sorted array, that is only the keys not the values?
    sub build_histogram {
        my ($bucket_size, @items) = @_;
        my %result;
        for (@items) {
            my $bucket = $bucket_size * floor($_ / $bucket_size);
            print "$bucket\n";
            $result{$bucket}++;
        }
        return %result;
    }
      I don't understand the subroutine that was posted earlier. I don't understand what I am supposed to pass to items?

      A list of numbers from which you want to build your histogram. And it doesn't have to be sorted.

      Your sample data generates this histogram for me:

      0: 67
      30: 73
      60: 93
      90: 75
      120: 26
      150: 18
      180: 10
      210: 5
      240: 2
      270: 1
      300: 1
      330: 2
      360: 1
      390: 1
      450: 1
      840: 1
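As a rough sketch of how the subroutine might be called and its result printed: the whole list of numbers goes in as one call, and the returned hash is printed with its keys sorted numerically. The stage times below are invented for illustration, and the bucket size of 30 is only an assumption based on the spacing of the keys in the histogram above. floor() is taken from the core POSIX module.

```perl
use strict;
use warnings;
use POSIX qw(floor);    # floor() is not a Perl builtin

# Same logic as the subroutine posted earlier, without the debug print.
sub build_histogram {
    my ($bucket_size, @items) = @_;
    my %result;
    for (@items) {
        # Round each value down to the start of its bucket.
        my $bucket = $bucket_size * floor($_ / $bucket_size);
        $result{$bucket}++;
    }
    return %result;
}

# Made-up values; in practice this would be the full list of stage times.
my @stage_times = (3, 12, 29, 31, 45, 61, 95);
my $bin_stage   = 30;    # assumed bucket size

# One call with the whole list, keeping the returned hash.
my %histogram = build_histogram($bin_stage, @stage_times);

# Sort bucket keys numerically before printing.
for my $bucket (sort { $a <=> $b } keys %histogram) {
    print "$bucket: $histogram{$bucket}\n";
}
```

If the numbers are stored in an array of hashes rather than a flat list, something like `map { $_->{stagetime} } @records` (where @records is a hypothetical name) would produce the list to pass in.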
        Please excuse my ignorance, but I'm not sure where to call the subroutine build_histogram, or how to print out the returned result hash. I think I'm going about this in the wrong way. I tried this:
        open(READMAP, "$command |") || error_exit("Cannot run readmap, $!"); #run the readmap command
        while(<READMAP>) { #loop through the output of the readmap command
            if (/StageTime/){
                @fields = split/\s+/; #Split the output by white space
                chop $fields[9];      # Remove the period after the last digit
                $diskxStats[$i]{'filesize'}=$fields[2];
                $diskxStats[$i]{'ftptime'}=$fields[5];
                $diskxStats[$i]{'stagetime'}=$fields[9];
                if ( $fields[5] != 0 ){
                    $diskxStats[$i]{'transferRate'} = $fields[2] / $fields[5];
                }
                else{
                    $diskxStats[$i]{'transferRate'} = $fields[2];
                }
                if ( $fields[9] != 0 ){
                    $diskxStats[$i]{'stagerate'} = $fields[2] / $fields[9];
                }
                else{
                    $diskxStats[$i]{'stagerate'} = $fields[2];
                }
                $i++;
            }
        }
        for $i (0 .. $#diskxStats ) {
            build_histogram ($bin_stage, $diskxStats[$i]{'stagetime'});
        }
        sub build_histogram {
            my ($bucket_size, @items) = @_;
            my %result;
            for (@items) {
                my $bucket = $bucket_size * floor($_ / $bucket_size);
                # print "$bucket\n";
                $result{$bucket}++;
            }
            return %result;
        }
        This is obviously incorrect, but I'm not sure what I need to change to get the results you got.