Re: Working out frequency statistics with perl

You aren't showing the whole script, so I wonder... do you have these lines anywhere in your code (near the top)?

use strict;
use warnings;

Apart from that, I wonder if your "readmap" command ever outputs any lines with initial whitespace. If so, your split statement should be:

    @fields = split " ";

because that will differ from /\s+/ -- try this snippet to see the difference:

$_ = "   begins with whitespace";

@a = split " ";
@b = split /\s+/;

print "quoted-space split returns ", scalar @a, " elements\n";
print "\\s+ split returns ", scalar @b, ", first one has length ", length($b[0]), "\n";

update: To clarify the issue, using "\s+" for splitting will get you into real trouble if the "readmap" output varies in terms of presence/absence of whitespace at the beginning of each line. If the output is consistent in this regard, then using "\s+" is probably not a problem -- you just need to make sure you've counted the field indexes correctly, so that $fields[2] etc really point at what you want them to point at.
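
To make the index problem concrete, here is a minimal sketch using two made-up lines (not real "readmap" output) that carry the same three values, one flush left and one indented. It only illustrates how the same index can point at different fields depending on which split form is used.

```perl
use strict;
use warnings;

# Two made-up lines with the same three values; the second one is indented.
my $flush    = "alpha beta gamma";
my $indented = "   alpha beta gamma";

# split ' ' skips leading whitespace, so the indexes agree for both lines.
my @awk_flush    = split ' ', $flush;
my @awk_indented = split ' ', $indented;

# split /\s+/ produces an empty leading field when the line starts with
# whitespace, which shifts every later index up by one.
my @re_flush    = split /\s+/, $flush;
my @re_indented = split /\s+/, $indented;

print "split ' '    : [$awk_flush[2]] vs [$awk_indented[2]]\n";
print "split /\\s+/ : [$re_flush[2]] vs [$re_indented[2]]\n";
```

With the quoted-space form, index 2 is "gamma" for both lines; with /\s+/, index 2 of the indented line is "beta" instead.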

Re^2: Working out frequency statistics with perl by wishartz (Beadle) on Jul 16, 2008 at 10:43 UTC
Hello again Monks, thanks for everybody's feedback. To answer the last reply I got: I am using warnings and strict, and I am not getting any errors. The output is always consistent as well. Here is a sample of the exact output from the readmap command when it matches the pattern StageTime.

[Sample output, 22 kB, not shown.]

I don't understand the subroutine that was posted earlier. I don't understand what I am supposed to pass to items. Am I supposed to pass the sorted array, that is, only the keys and not the values?

    sub build_histogram {
        my ($bucket_size, @items) = @_;
        my %result;
        for (@items) {
            my $bucket = $bucket_size * floor($_ / $bucket_size);
            print "$bucket\n";
            $result{$bucket}++;
        }
        return %result;
    }
Re^3: Working out frequency statistics with perl by moritz (Cardinal) on Jul 16, 2008 at 11:10 UTC
"I don't understand the subroutine that was posted earlier. I don't understand what I am supposed to pass to items?"

A list of numbers from which you want to build your histogram. And it doesn't have to be sorted.

Your sample data generates this histogram for me:

    0: 67
    30: 73
    60: 93
    90: 75
    120: 26
    150: 18
    180: 10
    210: 5
    240: 2
    270: 1
    300: 1
    330: 2
    360: 1
    390: 1
    450: 1
    840: 1
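
As a minimal sketch of that call: the subroutine is the one quoted in this thread (with its debugging print dropped), the bucket size of 30 is inferred from the histogram keys, and the numbers in `@stage_times` are made up for illustration. It assumes `floor` comes from the POSIX module.

```perl
use strict;
use warnings;
use POSIX qw(floor);

sub build_histogram {
    my ($bucket_size, @items) = @_;
    my %result;
    for (@items) {
        # Round each value down to the start of its bucket.
        my $bucket = $bucket_size * floor($_ / $bucket_size);
        $result{$bucket}++;
    }
    return %result;
}

# Made-up stage times, deliberately unsorted (illustration only).
my @stage_times = (12, 95, 31, 7, 64, 33, 118, 29);

# Call the subroutine once with the whole list and keep the returned hash.
my %histogram = build_histogram(30, @stage_times);

# Print the buckets in numeric order.
for my $bucket (sort { $a <=> $b } keys %histogram) {
    print "$bucket: $histogram{$bucket}\n";
}
```

If the numbers live inside a larger data structure, one option is to extract them into a flat list first (for example with `map`) and still call the subroutine a single time, since each call starts with an empty `%result`.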
Re^4: Working out frequency statistics with perl by wishartz (Beadle) on Jul 16, 2008 at 12:48 UTC
Please excuse my ignorance, but I'm not sure where to call the subroutine build histogram, and how to print out the returned result hash. I think I'm going about this in the wrong way. I tried this:

    open(READMAP, "$command |") || error_exit("Cannot run readmap, $!");   #run the readmap command
    while(<READMAP>) {   #loop through the output of the readmap command
        if (/StageTime/){
            @fields = split/\s+/;   #Split the output by white space
            chop $fields[9];        # Remove the period after the last digit
            $diskxStats[$i]{'filesize'}=$fields[2];
            $diskxStats[$i]{'ftptime'}=$fields[5];
            $diskxStats[$i]{'stagetime'}=$fields[9];
            if ( $fields[5] != 0 ){
                $diskxStats[$i]{'transferRate'} = $fields[2] / $fields[5];
            }
            else{
                $diskxStats[$i]{'transferRate'} = $fields[2];
            }
            if ( $fields[9] != 0 ){
                $diskxStats[$i]{'stagerate'} = $fields[2] / $fields[9];
            }
            else{
                $diskxStats[$i]{'stagerate'} = $fields[2];
            }
            $i++;
        }
    }

    for $i (0 .. $#diskxStats ) {
        build_histogram ($bin_stage, $diskxStats[$i]{'stagetime'});
    }

    sub build_histogram {
        my ($bucket_size, @items) = @_;
        my %result;
        for (@items) {
            my $bucket = $bucket_size * floor($_ / $bucket_size);
            # print "$bucket\n";
            $result{$bucket}++;
        }
        return %result;
    }

Which is obviously incorrect, but not sure what I need to change to get the results you got?
Re^5: Working out frequency statistics with perl by jethro (Monsignor) on Jul 16, 2008 at 14:26 UTC