This is based on a simple text histogram I added to jcwren's statswhore.pl. It works with an array of integers, but could easily be extended to floating-point values. You can set the bin size used to divide up your data, and it does some simple scaling to keep the output within one screen width.
Creates output something like:

    -5 .. -1 [ 1] #
     0 ..  4 [ 7] #######
     5 ..  9 [ 6] ######
    10 .. 14 [ 7] #######
    15 .. 19 [ 3] ###
    20 .. 24 [ 1] #

Update: Changed the name of the bins hash to %bin_count and fixed a bug with width scaling, both pointed out by tilly. Thanks for the feedback, tilly!

    sub show_histogram {
        # Prints a simple text histogram given a reference
        # to an array of integers.
        # Larry Leszczynski <larryl@furph.com>
        my ($array_ref, $binsize, $width) = @_;
        $binsize ||= 1;
        $width   ||= 50;

        use POSIX qw(ceil floor);

        # Divide input data into bins:
        my %bin_count = ();    # number of items in each bin
        foreach ( @$array_ref ) {
            my $bin = floor(($_+.5)/$binsize);
            $bin_count{$bin}++;
        }

        my $max_items = 0;     # maximum items in a single bin
        foreach ( values %bin_count ) {
            $max_items = $_ if $_ > $max_items;
        }

        # Try to keep histogram on one page width:
        my $scale = 1;
        if ( $max_items > $width ) {
            if ( $max_items <= ($width*5) ) {
                $scale = 5;
            }
            else {
                while ( ($max_items/$scale) > $width ) {
                    $scale *= 10;
                }
            }
        }

        my @bins   = sort {$a <=> $b} keys %bin_count;
        my $bin    = $bins[0];     # lowest value bin
        my $maxbin = $bins[-1];    # highest value bin

        my $binfmt_width = ( length $maxbin > length $bin )
                         ? length $maxbin : length $bin;
        my $cntfmt_width = length $max_items;

        my $start = $bin * $binsize;
        my $end   = $start + $binsize - 1;
        do {
            my $count = $bin_count{$bin} || 0;
            my $extra = ( $count % $scale ) ? '.' : '';
            printf "%*d .. %*d \[%*d\] %s$extra\n",
                $binfmt_width, $start,
                $binfmt_width, $end,
                $cntfmt_width, $count,
                '#' x ceil($count/$scale);
            $start += $binsize;
            $end   += $binsize;
        } while ( $bin++ < $maxbin );

        print "\n Scale: #=$scale\n" if $scale > 1;
    }

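To make the binning and scaling steps easier to check by hand, here is a minimal sketch that pulls them out into two small functions that return values instead of printing. The names `bin_counts` and `pick_scale` are hypothetical helpers invented for this illustration and are not part of the original snippet; the arithmetic is meant to mirror what `show_histogram` does for integer input.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper (not in the original post): count items per bin,
# using the same floor(($value + .5) / $binsize) rule as show_histogram.
sub bin_counts {
    my ($array_ref, $binsize) = @_;
    $binsize ||= 1;
    my %bin_count;
    $bin_count{ floor(($_ + .5) / $binsize) }++ for @$array_ref;
    return \%bin_count;
}

# Hypothetical helper (not in the original post): choose how many items
# one '#' represents so the longest bar fits within $width characters.
sub pick_scale {
    my ($max_items, $width) = @_;
    $width ||= 50;
    return 1 if $max_items <= $width;        # fits as-is
    return 5 if $max_items <= $width * 5;    # one '#' per 5 items
    my $scale = 10;                          # otherwise use powers of ten
    $scale *= 10 while $max_items / $scale > $width;
    return $scale;
}

# Example: bin a few integers with a bin size of 5 and list each range.
my @data    = (-3, 0, 4, 5, 9, 12);
my $binsize = 5;
my $counts  = bin_counts(\@data, $binsize);
for my $bin (sort { $a <=> $b } keys %$counts) {
    my $start = $bin * $binsize;
    printf "%3d .. %3d [%d]\n", $start, $start + $binsize - 1, $counts->{$bin};
}

# Example: scale chosen for different maximum bin counts at width 50.
printf "max %5d -> scale %d\n", $_, pick_scale($_, 50) for 30, 200, 6000;
```

Each bin index maps back to a label range of `$bin * $binsize` through `$bin * $binsize + $binsize - 1`, which is how the rows in the sample output get their start and end values. Keeping the helpers free of printing makes it straightforward to test the bin boundaries around zero and negative values.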