PerlMonks  

Problems with defining hashes

by Annemarie (Acolyte)
on Mar 30, 2005 at 12:46 UTC ( [id://443426]=perlquestion )

Annemarie has asked for the wisdom of the Perl Monks concerning the following question:

Dear Perl monks, I am trying to count earthquake magnitudes in time bins which I have labeled 0, 0.1, 0.3, 1, 3, 10, 30, 1000. Somehow things are not working as I expected, so I have tried to initialize my counting bins. However, I get error messages for the code below. All printed values should come out as 0, but some values are undefined. What's happening? Thanks for your help, Annemarie
for ($mag=5.0; $mag<9.0; $mag=$mag+0.1) {
    $n{$mag}{0}=$n{$mag}{0.1}=$n{$mag}{0.3}=$n{$mag}{1}=0;
    $n{$mag}{3}=$n{$mag}{10}=$n{$mag}{30}=$n{$mag}{1000}=0;
}
for ($mag=5.5; $mag<9.0; $mag=$mag+0.1) {
    print "$n{$mag}{30}\n";
}
Thanks for all responses! I am learning so much from seeing how other people tackle things!

Replies are listed 'Best First'.
Re: Problems with defining hashes
by polettix (Vicar) on Mar 30, 2005 at 13:00 UTC
    You're using floating point numbers as keys in the hash, and this is probably something that you shouldn't do unless you know exactly what you're doing.

    Consider that 5.0, 5.5 and 0.1 do not have an exact representation inside the computer, due to its binary nature and the finite number of bits per number. Since your loops start from different points (5.0 for the first, 5.5 for the second), you end up with slightly different numbers in $mag in the first and second loop, i.e. different keys in the hash.

    I would suggest that you fix the magnitudes somewhere (an array, for example), then use some integer index just to be sure not to mess things up.

    Flavio

    Don't fool yourself.
      5.0 and 5.5 do have exact representations; 0.1 does not.
        Touché :-) Another case of "negative laziness"...

        Flavio

        Don't fool yourself.
      Thanks, Flavio. I had completely forgotten about the issues with using floating point numbers.
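
The point about exact representations can be checked directly. This is only a minimal sketch, assuming a perl built with ordinary IEEE 754 double-precision numbers; it prints more digits than perl shows by default, and then adds 0.1 ten times to show that the small error accumulates.

```perl
use strict;
use warnings;

# Print more digits than perl shows by default to expose the stored value.
for my $x ( 5.0, 5.5, 0.1 ) {
    printf "%-4s is stored as %.20f\n", $x, $x;
}

# Adding 0.1 ten times does not land exactly on 1.
my $sum = 0;
$sum += 0.1 for 1 .. 10;
print "ten steps of 0.1 == 1? ", ( $sum == 1 ? "yes" : "no" ), "\n";
```

5.0 and 5.5 print with nothing but zeros after the expected digits, while 0.1 shows a non-zero tail. That tail is what makes a repeatedly incremented $mag drift.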
Re: Problems with defining hashes
by Fletch (Bishop) on Mar 30, 2005 at 13:08 UTC

    Seconded: using floats as hash keys is a bad idea. Run this version to see why:

    my @keys = qw( 0 0.1 0.3 1 3 10 30 1000 );
    my %n;

    for ( my $mag = 5.0 ; $mag < 9.0 ; $mag += 0.1 ) {
        @{ $n{ $mag } }{ @keys } = ( 0 ) x @keys;
    }

    for ( sort { $a <=> $b } keys %n ) {
        print "mag: $_\t$n{$_}->{30}\n";
    }

    Perhaps if you stepped back and described what you're trying to accomplish with this hash . . .

Re: Problems with defining hashes
by tlm (Prior) on Mar 30, 2005 at 13:47 UTC

    To expand a bit on frodo72's reply, if you change your code slightly to:

    for ($mag=5.0; $mag<9.0; $mag=$mag+0.1) {
        $n{$mag}{0}=$n{$mag}{0.1}=$n{$mag}{0.3}=$n{$mag}{1}=0;
        $n{$mag}{3}=$n{$mag}{10}=$n{$mag}{30}=$n{$mag}{1000}=0;
    }
    for ($mag=5.5; $mag<9.0; $mag=$mag+0.1) {
        print "$mag $n{$mag}{30}\n";
    }
    you'll see that you are not initializing what you think you are. The output of the above looks like this:
    5.5 0
    5.6 0
    5.7 0
    5.8 0
    5.9 0
    6 0
    6.1 0
    6.2 0
    6.3 0
    6.4 0
    6.5
    6.6
    6.7
    6.8
    6.9
    6.99999999999999 0
    7.09999999999999 0
    7.19999999999999 0
    7.29999999999999 0
    7.39999999999999 0
    <truncated>
    Try this:
    use strict;

    my @bin_labels = qw( 0 0.1 0.3 1 3 10 30 1000 );

    {
      my $step_size = 0.1;
      my $start     = 5.0;
      my $finish    = 9.0;

      sub index_to_mag {
        my $i = shift;
        return $start + $i * $step_size;
      }

      sub mag_to_index {
        my $mag = shift;
        return sprintf "%.f", ($mag - $start) / $step_size;
      }
    }

    my @n;
    for ( my $i = 0; $i < mag_to_index( 9.0 ); ++$i ) {
      $n[ $i ]{ $_ } = 0 for @bin_labels;
    }

    for ( my $mag = 5.5; $mag < 9.0; $mag = sprintf "%.1f", $mag + 0.1 ) {
      printf "%.1f %f\n", $mag, $n[ mag_to_index( $mag ) ]{ 30 };
    }

    Afterthought: Maybe some clarification will make the code above more useful. I switched from using a hash (%n) to using an array (@n), for the reasons that frodo72 already explained. In this case, there is a simple way to convert between array indexes and magnitudes; this is encapsulated in index_to_mag and mag_to_index. The frequent use of (s)printf takes care of avoiding round-off errors.
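
To see the round-off that the (s)printf calls guard against, here is a minimal sketch that compares the keys generated by the two loops in the original question. The names %first and @missing are made up for illustration, and the result assumes a perl built with ordinary IEEE 754 doubles and the default number-to-string conversion.

```perl
use strict;
use warnings;

# Keys produced by the first loop of the original question (start at 5.0).
my %first;
for ( my $mag = 5.0; $mag < 9.0; $mag += 0.1 ) {
    $first{$mag} = 1;    # $mag is stringified to become the hash key
}

# Keys the second loop looks up (start at 5.5).
my @missing;
for ( my $mag = 5.5; $mag < 9.0; $mag += 0.1 ) {
    push @missing, "$mag" unless exists $first{$mag};
}

print "looked up but never initialized: @missing\n";
```

Both loops reach a value near 6.5, but after a different number of additions, so the two values differ in their last bits and stringify to different keys.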

    Update: Made proper closures out of mag_to_index and index_to_mag.

    the lowliest monk

Re: Problems with defining hashes
by graff (Chancellor) on Mar 30, 2005 at 15:36 UTC
    This might be an easier way to avoid the floating-point problem for your hash keys:

    # first, generate fixed-precision strings for magnitudes
    # and bin values:
    my @mag_keys = map { sprintf("%.1f", $_/10) } ( 50 .. 89 );
    my @bins = qw/0 0.1 0.3 1 3 10 30 1000/;

    # now initialize hash bins for counting:
    # (this is probably unnecessary, unless you're re-using
    # the hash on multiple separate data sets)
    my %n;
    for my $mag ( @mag_keys ) {
        for my $bin ( @bins ) {
            $n{$mag}{$bin} = 0;
        }
    }

    The next thing to watch out for, when actually counting things up, is to avoid using the "==" operator to test whether a floating point value from your input data matches a given hash key. Use only "<", ">", "<=", ">=" as needed, or else use sprintf on the data value to get it into the same precision as the hash key, then use "eq" (or "gt", "ge", "le", "lt").
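
As a minimal sketch of that advice, the counting step might look like the following. The @events records are hypothetical values invented for illustration, not data from this thread; the point is only that each magnitude goes through the same sprintf format as the hash keys before it is used for a lookup.

```perl
use strict;
use warnings;

my @bins = qw/0 0.1 0.3 1 3 10 30 1000/;
my %n;
for my $mag ( map { sprintf "%.1f", $_ / 10 } 50 .. 89 ) {
    $n{$mag}{$_} = 0 for @bins;
}

# Hypothetical input records: [ magnitude, bin label ]. The values are made up.
my @events = ( [ 6.5, '30' ], [ 5.0 + 15 * 0.1, '30' ], [ 7.04, '1' ] );

for my $event (@events) {
    my ( $mag, $bin ) = @$event;
    my $key = sprintf "%.1f", $mag;    # same precision as the hash keys
    next unless exists $n{$key};       # magnitude outside 5.0 .. 8.9
    $n{$key}{$bin}++;
}

print "6.5 / 30: $n{'6.5'}{30}\n";
print "7.0 / 1:  $n{'7.0'}{1}\n";
```

Looking up the normalized string in the hash is the same idea as comparing with "eq": two magnitudes that agree to one decimal place land in the same bin even if their floating-point values differ in the last bits.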

    I'm still scratching my head about the "0, 0.1, 0.3, ..." series -- that jump from 30 to 1000 seems odd.

      As I just said to Flavio, I had completely forgotten about issues with floating point numbers. Your suggestions work beautifully and are brief. Thank you! The values 0.1 to 30 cover logarithmic time intervals during which I assume my data to be complete. 1000 is an arbitrary default value for aftershocks outside my time intervals. I could have used any other name, maybe 'outside'. The '0' collects foreshock data. I hope the explanation saves your head from being scratched. ;-)
Re: Problems with defining hashes
by melora (Scribe) on Mar 30, 2005 at 23:35 UTC
    What about using the initialization as written and then using a foreach loop to work with the contents? Just a thought; it would depend on how you need to use the hash.
Re: Problems with defining hashes
by tphyahoo (Vicar) on Mar 31, 2005 at 14:20 UTC
    For those who, like me, haven't learned sprintf yet, here's a nice explanation/tutorial on sprintf.

    What got me, specifically, was %.1f. I thought the 1 (one) was an l (ell). Dur...

    **********

    Those new to the x operator (aka repetition operator) may want to have a look at this explanation of the x operator.
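
For reference, here is a small sketch of the two forms of the x operator, together with the hash slice used earlier in this thread. The names %bin, @zeros and $rule are made up for illustration.

```perl
use strict;
use warnings;

my @keys = qw( 0 0.1 0.3 1 3 10 30 1000 );

# List repetition: a parenthesized list on the left, a count on the right.
# An array in the count position is evaluated in scalar context (its length).
my @zeros = (0) x @keys;

# Hash slice: assign all the zeros to all the keys at once.
my %bin;
@bin{@keys} = @zeros;

# String repetition: a plain scalar on the left gives a repeated string.
my $rule = '-' x 10;

print scalar(@zeros), " zeros; bin 30 = $bin{30}; rule = $rule\n";
```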

