punch_card_don has asked for the wisdom of the Perl Monks concerning the following question:

Molten Monks,

Why is my code assigning the last-retrieved value to all elements in my hash?

I have a DB table, "booklist_table" that holds information about books to be published. As the book evolves, it can change title, and each time it does a new entry is made in the table with a unique update_id. The object of this script is to pull out all the LATEST titles of all books in the table and tell me the update_id's of those records. We start with a hash of book IDs, %booklist_1, where the keys are the IDs.

$sql = "select update_id, title from booklist_table where (book_id = ? +) order by update_id desc limit 1"; $sth = $dbh->prepare($sql) or die("Could not prepare!" . $dbh->errstr) +; foreach $book_id (keys %booklist_1) { $sth->execute($book_id) or die("Could not execute!" . $dbh->errstr +); ($update_id, $title) = $sth->fetchrow_array(); $booklist_1{$book_id}{'update_id'} = $update_id; $booklist_1{$book_id}{'title'} = $title; print "$book_id > $booklist_1{$book_id}{'update_id'}\n"; } $sth->finish; foreach $book_id (keys %booklist_1) { print "XXX $book_id > $booklist_1{$book_id}{'update_id'}\n"; }
The second loop is a temporary debugging tool to see how the booklist_1 hash was populated. The script gives me this output:
100 > 456 101 > 753 102 > 489 103 > 259 XXX 100 > 259 XXX 101 > 259 XXX 102 > 259 XXX 103 > 259
That is, every $booklist_1{$book_id}{'update_id'} is being set to the last retrieved update_id.

What the heck?

Thanks.




Time flies like an arrow. Fruit flies like a banana.

Replies are listed 'Best First'.
Re: Why is my code assigning the last-retrieved value to all elements in my hash?
by kyle (Abbot) on Jul 09, 2008 at 16:42 UTC

    My guess is there's a reference somewhere you're not expecting. For example, if %booklist_l is initialized this way:

    my %booklist_l; @booklist_l{ @book_id_list } = ( { title => 'empty', update_id => 'missing' } ) x scalar @book_id_list;

    ...then every item in %booklist_l has the same reference to an "empty" record.

    Also, DBI, when it returns things from fetchrow_hashref, always returns the same reference (as a performance optimization). As such, if you do this:

    $booklist_l{ $book_id } = $sth->fetchrow_hashref;

    ...again, you'll have the same reference everywhere. In that case, you can get a copy of what it returns like this:

    $booklist_l{ $book_id } = { %{$sth->fetchrow_hashref} };

    That doesn't look like what you're doing, but I can't run your example code, so I'm suspicious.

    Update: I think I didn't explain my suspicion very well. Each $booklist_l{ $book_id } is a reference to a hash. I suspect they're all the same reference. How that happened, I'm not sure, but I give a couple of possibilities above. An easy way to check whether they're all the same reference is to stringify them. Your ending debugging loop could be:

    foreach $book_id (keys %booklist_1) { my $r = $booklist_l{$book_id}; print "XXX $book_id > $r->{'update_id'} -- from $r\n"; }
      Hmmm...several fingers pointed at referencing the hash so far in this thread...

      The %booklist_1 hash is populated from a previous query, like this:

      $sql = "SELECT book_id FROM booklist_table WHERE (expected_publcn_date + >= '".$start_date."' AND expected_publcn_date <= '".$end_date."')"; $sth = $dbh->prepare($sql) or die("Could not prepare!" . $dbh->errstr) +; $sth->execute() or die("Could not execute!" . $dbh->errstr); while ($book_id = $sth->fetchrow_array()) { $booklist_1{$book_id} = 1; } $sth->finish;



      Time flies like an arrow. Fruit flies like a banana.

        Use strict and warnings!!!

        use Data::Dumper; my %booklist_l; $booklist_l{ 'foo' } = 1; $booklist_l{ 'bar' } = 1; $booklist_l{ 'foo' }{ 'update_id' } = 'update_id 1'; $booklist_l{ 'foo' }{ 'title' } = 'title 1'; $booklist_l{ 'bar' }{ 'update_id' } = 'update_id 2'; $booklist_l{ 'bar' }{ 'title' } = 'title 2'; print Dumper \%booklist_l; print Dumper \%1; __END__ $VAR1 = { 'bar' => 1, 'foo' => 1 }; $VAR1 = { 'update_id' => 'update_id 2', 'title' => 'title 2' };

        That number '1' you put into every entry is being treated as a symbolic reference to the variable %1.

        I think the bug is here. Try to write something like
        while ( ($book_id) = $sth->fetchrow_array() ) { $booklist_1{$book_id} = 1; }
        (note the extra parenthesis in order to enforce list context)

        or

        while ( $book_id = $sth->fetchrow_arrayref() ) { $booklist_1{ $book_id->[0] } = 1; }

        Code is not tested

        Best regards

        UPDATE:. I have overseen the kyle suggestion. It seems that the right code regarding the second loop will be:

        while ( ($book_id) = $sth->fetchrow_array() ) { $booklist_1{$book_id} = {}; }
Re: Why is my code assigning the last-retrieved value to all elements in my hash?
by NetWallah (Canon) on Jul 09, 2008 at 17:22 UTC
    I think it depends on how you initialize %booklist_1.

    In the code below, Init_1 seems to reproduce your problem, while Init_2 provides an alternative, properly functioning initialization.

    I remember running into this earlier, and posting a very similar issue in Conquering "x" operator over-optimization. merlyn gently indicated that a "deep copy" is required in this case.

    use strict; use warnings; my ($book_id , %booklist_1); Init_1(); populate_and_print(); Init_2(); populate_and_print(); #-------------------- sub populate_and_print{ foreach $book_id (keys %booklist_1) { my ($update_id, $title) = ( int (100 *rand()), "title" . int ( +1000 *rand())); $booklist_1{$book_id}{'update_id'} = $update_id; $booklist_1{$book_id}{'title'} = $title; print "$book_id > $booklist_1{$book_id}{'update_id'}\n"; } #$sth->finish; foreach $book_id (keys %booklist_1) { print "XXX $book_id > $booklist_1{$book_id}{'update_id'}\n"; } } sub Init_1{ %booklist_1=(); @booklist_1{ 100..104 } = ( { title => 'empty', update_id => 'missing' } ) x 5; } sub Init_2{ %booklist_1=(); $booklist_1{$_ } = { title => 'empty', update_id => 'missing' } for 100..104; }

         Have you been high today? I see the nuns are gay! My brother yelled to me...I love you inside Ed - Benny Lava, by Buffalax

Re: Why is my code assigning the last-retrieved value to all elements in my hash?
by jethro (Monsignor) on Jul 09, 2008 at 16:58 UTC
    Use Data::Dumper in the loop to see what $booklist_1 really contains and print out update_id and title

    I deleted your database code and substituted $sth->fetchrow() with (rand(),rand()). My output was ok, no repetition of the last element.

Re: Why is my code assigning the last-retrieved value to all elements in my hash?
by nikool (Novice) on Jul 09, 2008 at 16:47 UTC
    Every $booklist_1{$book_id}{'update_id'} is being set to the first record of your SQL Query. You should change your loop for something like : while (@data = $sth->fetchrow_array()) { ... }

      The OP's query returns only one record (it includes 'LIMIT 1'). On each loop, he executes the query for the new $book_id and then fetches that one record.

Re: Why is my code assigning the last-retrieved value to all elements in my hash?
by pc88mxer (Vicar) on Jul 09, 2008 at 20:21 UTC
    Here is a wild guess. Is it possible that $booklist_1{$id} are all references to the same hash?

    How are you initially populating %$booklist_1?

    To test this, run this code:

    for $book_id (keys %$booklist_1) { print $booklist_1{$book_id}, "\n"; }
    You will get output like HASH(0x12345678). If you get the same line repeated, then all of your sub-hashes are in fact the same hash object.

    I could duplicate your output when I pre-initialized all of the %$booklist_1 hashes with the same hash.

Re: Why is my code assigning the last-retrieved value to all elements in my hash?
by TJRandall (Sexton) on Jul 10, 2008 at 00:51 UTC
    Just a suggestion - use Data::Dumper for printing out the entire hash, rather than looping through. [ http://search.cpan.org/~ilyam/Data-Dumper-2.121/Dumper.pm [ So maybe I'm missing something - where are you incrementing $book_id?
Re: Why is my code assigning the last-retrieved value to all elements in my hash?
by punch_card_don (Curate) on Jul 10, 2008 at 12:51 UTC
    Soooooo....Many thanks to Kyle and others for sticking with this even when I got crabby yesterday.

    Really the central problem was extending a one-dimensional hash to a two-dimensional one and then expecting the first-dimension's keys to remain scalars. Having extended the original hash into a hash of hashes, clearly the first-dimension's keys had to become hash-refs.

    With all due respect to Kyle, I don't believe it has anything at all to do with using "1".

    The booklist_1 hash populating loop

    while ($book_id = $sth->fetchrow_array()) { $booklist_1{$book_id} = 1; }
    works just fine. There'd be something seriously fubar if you couldn't set the value of a variable to 1. At this point, keys %booklist_1 gives me the list of book_id's.

    Where it gets messed up is in the second loop where I extend the hash

    $sql = "select update_id, title from booklist_table where (book_id = ? +) order by update_id desc limit 1"; $sth = $dbh->prepare($sql) or die("Could not prepare!" . $dbh->errstr) +; foreach $book_id (keys %booklist_1) { $sth->execute($book_id) or die("Could not execute!" . $dbh->errstr +); ($update_id, $title) = $sth->fetchrow_array(); $booklist_1{$book_id}{'update_id'} = $update_id; $booklist_1{$book_id}{'title'} = $title; print "$book_id > $booklist_1{$book_id}{'update_id'}\n"; } $sth->finish;
    When I extend it to two dimensions, it becomes a hash of hashes and the first dimension's keys must become references to the second dimenion hashes. So now keys %booklist_1 is a bunch of hash references, and no longer the bunch of book_ids I set them to originally. Kyle set me onto this when he wrote

    You can't have a $b{foo}{bar} and have $b{foo} set to some unrelated value (such as 1). If there is a $b{foo}{bar}, then $b{foo} must somehow be a reference to a hash where you access ->{bar}.

    My final solution is to use distinct hashes, like this:

    $sql = "SELECT book_id FROM booklist_table WHERE (expected_publcn_date + >= '".$start_date."' AND expected_publcn_date <= '".$end_date."')"; $sth = $dbh->prepare($sql) or die("Could not prepare!" . $dbh->errstr) +; $sth->execute() or die("Could not execute!" . $dbh->errstr); while ($book_id = $sth->fetchrow_array()) { $booklist_1_temp{$book_id} = 1; } $sth->finish; $sql = "select update_id, title from booklist_table where (book_id = ? +) order by update_id desc limit 1"; $sth = $dbh->prepare($sql) or die("Could not prepare!" . $dbh->errstr) +; foreach $book_id (keys %booklist_1_temp) { $sth->execute($book_id) or die("Could not execute!" . $dbh->errstr +); ($update_id, $title) = $sth->fetchrow_array(); $booklist_1{$book_id}{'update_id'} = $update_id; $booklist_1{$book_id}{'title'} = $title; print "$book_id > $booklist_1{$book_id}{'update_id'}\n"; } $sth->finish;
    Note the "_temp" added to the booklist_1 hash name in the first loop. Which works like a charm.




    Time flies like an arrow. Fruit flies like a banana.
      With all due respect to Kyle, I don't believe it has anything at all to do with using "1".

      I think you're still misunderstanding what kyle said.

      Indeed, the problem isn't that you were using "1" specifically -- but kyle didn't say that. He said that the problem is that you were using a value that wasn't a hashref and then later trying to treat it as one (and the value you were using that wasn't a hashref happened to be "1").

      When I extend it to two dimensions, it becomes a hash of hashes and the first dimension's keys must become references to the second dimenion hashes. So now keys %booklist_1 is a bunch of hash references, and no longer the bunch of book_ids I set them to originally.

      No. The keys don't change. I believe the keys can never be anything other than strings. It's the values that become hash references in a hash of hashes.

      I commend Data::Dumper to you -- it really is a good way of visualising what's going on underneath your data structures. And good luck -- I think even the most wizened master here will agree that the easiest thing to do with Perl data structures is to mess them up. :)

      With all due respect to Kyle, I don't believe it has anything at all to do with using "1".

      It's not "1" specifically that's the problem. The problem is using any particular static string for every book ID. You had the same problem using "ok" instead of "1". You'd have the same problem using "this_will_work_for_sure" or a million other things. It's also a problem you wouldn't have if you'd use strict because it won't allow you to use a string as a reference.

      Really the central problem was extending a one-dimensional hash to a two-dimensional one and then expecting the first-dimension's keys to remain scalars. Having extended the original hash into a hash of hashes, clearly the first-dimension's keys had to become hash-refs.

      ...

      When I extend it to two dimensions, it becomes a hash of hashes and the first dimension's keys must become references to the second dimenion hashes. So now keys %booklist_1 is a bunch of hash references, and no longer the bunch of book_ids I set them to originally.

      (Emphasis was added by me.)

      Just a nit-pick that may prove useful in the future: hashes are made of key/value pairs, in which the key is something closer to a string than to a scalar, and the value is a scalar. In particular, the value can be a reference to a hash.

      Thus, there's no way to put a reference in the key value of a regular* Perl hash. The most you can have is that the "stringification" of a reference can be used as a hash key, but this is something different: you can't use the hash key to directly* access the reference any more using some obscure dark magic.

      * I'm using the words regular and directly because I'm sure that there will be someone pointing out that there are modules/ways to have a magical hash that has whatever key you want, and that you can access a reference even if all you have is a stringification! :D

      perl -ple'$_=reverse' <<<ti.xittelop@oivalf

      Io ho capito... ma tu che hai detto?
Re: Why is my code assigning the last-retrieved value to all elements in my hash?
by Anonymous Monk on Jul 10, 2008 at 00:47 UTC