in reply to Help with calculating stats from a flat-file database

If I understand the description, you want a boolean value for each date+state tuple -- that is: "there (was | was not) at least one tornado on date X in state Y".

If that is the case, think about creating an array of 50 elements (one for each state). The value of each element is a reference to a hash, where the keys of the hash are the dates when one or more tornados occurred in the given state (the value assigned to each element is of no importance: it is the existence of a given date as a hash key for the given state that matters). So, suppose the data is like this:

    State  Date        other stuff
    1      03/03/2003  one bad mother of a storm...
    5      03/05/2003  nothing to write home about...
    8      02/08/2003  they said it could never happen here ...

You might try something like this, just to list how many tornado days there were in each state, based on the given input report:

    my @states;    # array holding one hash per state

    while (<>) {
        next unless /^\s*\d/;    # skip the header line and any blank lines
        my ( $state_number, $date ) = split;
        $states[$state_number]{$date} = undef;
    }

    # now to summarize:
    # for each state that had any tornados at all,
    # list number of tornado days:
    for ( 1 .. 50 ) {
        if ( defined $states[$_] ) {
            printf( "state_id %2d: %3d tornado days\n",
                    $_, scalar keys %{ $states[$_] } );
        }
    }

(untested, of course)
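
To get the yes/no answer for one particular date and state, an exists test on the same structure should be enough. Here is a minimal sketch with the sample rows inlined so it runs on its own; had_tornado is a helper name I made up for illustration, and the state numbers and dates are just the ones from the sample above:

```perl
use strict;
use warnings;

# Sample rows from the post, inlined so the sketch is self-contained.
my @lines = (
    "1 03/03/2003 one bad mother of a storm...",
    "5 03/05/2003 nothing to write home about...",
    "8 02/08/2003 they said it could never happen here ...",
);

my @states;    # $states[$n] is a hash ref keyed by tornado date
for my $line (@lines) {
    my ( $state_number, $date ) = split ' ', $line;
    $states[$state_number]{$date} = undef;
}

# had_tornado() is a hypothetical helper, not part of the original code.
# It returns 1 if the date exists as a hash key for that state, else 0.
# Checking defined first avoids autovivifying an empty hash for a state
# that never appeared in the input.
sub had_tornado {
    my ( $state_number, $date ) = @_;
    return 0 unless defined $states[$state_number];
    return exists $states[$state_number]{$date} ? 1 : 0;
}

printf "state 1 on 03/03/2003: %s\n",
    had_tornado( 1, '03/03/2003' ) ? 'yes' : 'no';
printf "state 1 on 03/05/2003: %s\n",
    had_tornado( 1, '03/05/2003' ) ? 'yes' : 'no';
```

The test has to be exists rather than defined on the hash element, because the values stored are all undef; only the presence of the key carries information.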