So, you're expecting a user to come up with a set of threshold values to put on the command line? How likely is it, really, that a user will want to try a bunch of different variations of threshold values? (In fact, how likely is it that a user already knows what ranges of values are going to be useful?)

Any a priori assumptions that might provide sensible aid to reduce the user's "cognitive load" would be worth building into the script -- e.g. maybe threshold values should always be evenly spaced over an appropriate range, and users would just say how many thresholds (histogram bins) they want on a given run.
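To illustrate the evenly-spaced idea, here is a minimal sketch. The helper name even_thresholds and the sample range of 0 to 10000 are made up for this example; the real range would have to come from your data or from a sensible default.

```perl
use strict;
use warnings;

# Hypothetical helper: return the upper edges of $nbins evenly spaced bins
# covering the range $min .. $max.
sub even_thresholds {
    my ( $min, $max, $nbins ) = @_;
    die "need at least one bin\n" if $nbins < 1;
    my $width = ( $max - $min ) / $nbins;
    return map { $min + $_ * $width } 1 .. $nbins;
}

my @thresh = even_thresholds( 0, 10000, 4 );
print "@thresh\n";    # 2500 5000 7500 10000
```

With something like this, the user only has to supply the number of bins, and the script works out the thresholds.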

Regarding the code you posted, I'd offer a few "stylistic" points:

use Getopt::Long; # or Getopt::Std, which might be easier to grok.
That will make it easy to offer useful default values for things like number of bins, start-time and end-time. There could even be a default value for the name of the log file to read.
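As a rough sketch of what that could look like: the option names (--bins, --start, --end, --log) and all the default values below are assumptions picked for illustration, not anything taken from your script. GetOptionsFromArray is used instead of plain GetOptions only so that the parsing sits in a sub that can be called with any argument list.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option names and defaults; adjust to suit the real script.
sub parse_options {
    my @args = @_;
    my %opt  = (
        bins  => 10,
        start => '00:00:00',
        end   => '23:59:59',
        log   => 'access.log',
    );
    # Values given on the command line overwrite the defaults in %opt.
    GetOptionsFromArray( \@args, \%opt, 'bins=i', 'start=s', 'end=s', 'log=s' )
        or die "bad options; expected --bins N, --start T, --end T, --log FILE\n";
    return %opt;
}

my %opt = parse_options(@ARGV);
print "reading $opt{log} with $opt{bins} bins\n";
```

Run with no arguments, this falls back to the defaults; any option the user does give replaces just that one default.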

Perl gives a warning about line 59 -- it's harmless, but worth fixing.

When there's an "if" block that always ends with "exit 1" (which should just be "die"), there's no need for an "else" block after it, so you can eliminate a layer of nesting. Likewise, you don't need an "else" block that contains just a "next" statement, given that there's nothing after that block in the enclosing loop.
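A small sketch of both points, using made-up subs (check_args and count_numeric are not from your script, and the usage message is only a placeholder):

```perl
use strict;
use warnings;

# Hypothetical check, standing in for the argument test in the original script.
sub check_args {
    my @args = @_;
    die "Usage: need a log file name\n" unless @args;
    # No "else" needed: execution only gets here when the check passed.
    return $args[0];
}

# Same idea inside a loop: skip unwanted lines up front instead of if/else.
sub count_numeric {
    my @lines = @_;
    my $count = 0;
    for my $line (@lines) {
        next unless $line =~ /^\d+$/;
        $count++;
    }
    return $count;
}

print check_args('access.log'), "\n";
print count_numeric( '12', 'ab', '7' ), "\n";
```

Besides saving a level of indentation, "die" sends the message to STDERR and sets a non-zero exit status in one step.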

Assuming you have an array of threshold values, you just need to make sure the array values are sorted, and loop over them to work out which bin a given value should be counted in -- here's a simple example that leaves aside all your other issues about selecting/excluding log entries:

    my @thresh = ( 1000, 4000, 7000, 10000 );
    my @bins;
    while (<LOG>) {
        my $val = ( split )[10];
        next unless ( $val =~ /^\d+$/ );
        my $i;
        for $i ( 0 .. $#thresh ) {
            last if ( $val < $thresh[$i] );
        }
        $bins[$i]++;
    }
(UPDATED to give appropriate scope to $i -- thanks to wfsp for pointing that out.)

Geez! As GrandFather points out below, I really didn't get that right. Even after wfsp had told me it wouldn't work, I still had it wrong. What I should have suggested was something like this (thanks, GrandFather):

    my @thresh = ( 1000, 4000, 7000, 10000 );
    my @bins;
    while (<LOG>) {
        my $val = ( split )[10];
        next unless ( $val =~ /^\d+$/ );
        my $i = 0;
        while ( $i < @thresh and $val > $thresh[$i] ) {
            $i++;
        }
        $bins[$i]++;
    }
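
For checking the logic without a log file, here is a minimal sketch that wraps the same loop in a sub and prints the counts. The sub name bin_counts, the sample values and the report layout are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: same loop as above, wrapped so it can be run on a list.
sub bin_counts {
    my ( $thresh, @values ) = @_;
    # One extra bin catches values above the top threshold.
    my @bins = (0) x ( @$thresh + 1 );
    for my $val (@values) {
        my $i = 0;
        $i++ while $i < @$thresh and $val > $thresh->[$i];
        $bins[$i]++;
    }
    return @bins;
}

my @thresh = ( 1000, 4000, 7000, 10000 );
my @bins   = bin_counts( \@thresh, 500, 1000, 2500, 9999, 20000 );
for my $i ( 0 .. $#bins ) {
    my $label = $i < @thresh ? "<= $thresh[$i]" : "> $thresh[-1]";
    printf "%-10s %d\n", $label, $bins[$i];
}
```

Note that a value equal to a threshold lands in the lower bin, and that pre-filling @bins with zeros avoids "uninitialized" warnings when an empty bin is printed.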

In reply to Re: Any easy way to do this? by graff
in thread Any easy way to do this? by jb60606
