PerlMonks  

using references as keys in a hash.

by habit_forming (Monk)
on Feb 23, 2003 at 02:01 UTC ( [id://237836]=perlquestion )

habit_forming has asked for the wisdom of the Perl Monks concerning the following question:

It seems as though perl cannot use a stringified reference to something as a key for a hash. Below is a quick example of this. What am I doing wrong? And if I am not doing anything wrong... why does perl do what it does?
Example:
%hash = ([1,2] => "STUFF");
foreach $key (keys (%hash) ) {
    print "key |$key| => value |$hash{$key}| :".
          " dereferenced key |@{$key}|\n";
}
This prints out:
key |ARRAY(0x80fbb0c)| => value |STUFF| : dereferenced key ||
In fact "ref()" does not even see the key in that hash as a reference any longer.

So I have two questions:
1. Was this behavior planned?
2. If so, to what benefit?


--habit

Replies are listed 'Best First'.
Re: using references as keys in a hash.
by pfaut (Priest) on Feb 23, 2003 at 02:11 UTC

    Perl can use the stringified reference as a key in a hash but that's not the same thing as using the reference itself. The reference is converted to a string to use as a key. The resulting key is just a scalar.

    You might want to look into Tie::RefHash.
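
    A minimal sketch of the difference, assuming only the core Tie::RefHash module. The variable names (%plain, %refhash, $aref) are made up for illustration:

```perl
use strict;
use warnings;
use Tie::RefHash;   # ships with perl

my $aref = [1, 2];

# Plain hash: the key is only the string "ARRAY(0x...)".
my %plain = ($aref => "STUFF");
my ($plain_key) = keys %plain;
print "plain key is a reference? ", (ref($plain_key) ? "yes" : "no"), "\n";

# Tie::RefHash: the key comes back as the original reference.
tie my %refhash, 'Tie::RefHash';
$refhash{$aref} = "STUFF";
my ($ref_key) = keys %refhash;
print "tied key is a reference? ", (ref($ref_key) ? "yes" : "no"), "\n";
print "dereferenced key: @{$ref_key}\n";
```

    The tied hash is slower than a plain one, since every access goes through the tie interface.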

    --- print map { my ($m)=1<<hex($_)&11?' ':''; $m.=substr('AHJPacehklnorstu',hex($_),1) } split //,'2fde0abe76c36c914586c';
      Thank you for your reply. I understand what perl is doing, but the fact that I cannot get the data back from the stringified reference is a bit... well... annoying.

        It is a tradeoff for speed. Defining the standard hash such that each key is forced to be a string allows key lookups to be very fast indeed.

        Since the requirement for objects (ie real, unstringified references) as keys seems to be rare, it makes sense to gain the extra speed for the common case. Since Tie::RefHash is available, the additional inconvenience for this rare case is small, so it seems still to be a good trade.

        If I understand correctly, the plan for perl6 is to allow access to alternate behaviours by a different mechanism (declare a property on the hash) which will make it easier to switch in much faster implementations than perl5's tying mechanism allows, but the underlying tradeoff will remain the same - the default will be the implementation that allows fastest key lookup, ie strings.

        Hugo
        Well, perhaps you can. One could write a piece of XS or Inline::C that tries to recreate the reference by peeking at what's at that memory address. But the value might have been garbage collected, and then you run into trouble.

        Abigail

Re: using references as keys in a hash.
by seattlejohn (Deacon) on Feb 23, 2003 at 02:15 UTC
    When you stringify a reference, it no longer is a reference -- it's just a plain old string that happens to contain a human-readable representation of an address and a data type. So the code above is doing just what it would do if your initial hash assignment read:
    %hash = ("ARRAY(0x80fbb0c)" => "STUFF");
    If you turned on strict, you'd get a fatal error telling you that a string can't be used as an ARRAY ref.

    Hash keys must be strings, not references. If you want to "dereference" something, you probably should be storing the reference itself as a hash value, so you can use it directly and not need to dereference it in the first place.
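
    A small sketch of both points, trapping the strict error with eval so the message can be printed. The names %by_name and "pair" are hypothetical:

```perl
use strict;
use warnings;

my %hash = ([1, 2] => "STUFF");
my ($key) = keys %hash;

# Under "strict refs", dereferencing a plain string is a fatal error,
# so trap it with eval in order to show the message.
my @items = eval { @{$key} };
my $error = $@;
print "Error: $error";

# Storing the reference as a hash VALUE keeps it usable.
my $aref = [1, 2];
my %by_name = (pair => $aref);
print "value dereferenced: @{ $by_name{pair} }\n";
```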

            $perlmonks{seattlejohn} = 'John Clyman';

Re: using references as keys in a hash.
by Zaxo (Archbishop) on Feb 23, 2003 at 02:14 UTC

    Correct, it is not a reference any more. Hash keys are strings, and the reference has been stringified to fit. That is 'planned', so to speak, by the key hashing algorithm.

    That's not to say it is useless. References to distinct variables make fine keys, guaranteed to be unique.

    After Compline,
    Zaxo

      They are only unique as long as the variable(s) exist. If they are garbage collected, Perl will reuse the memory, and a new reference could stringify to the same value as the old one.
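
      One way to sidestep the reuse problem, sketched under the assumption that you control how entries are added: keep the reference itself in the value, so the referent stays alive for as long as its key is in the hash. The helper remember() and the hash %registry are hypothetical names:

```perl
use strict;
use warnings;

my %registry;

# Hypothetical helper: key by the stringified reference, but keep the
# reference in the value so the referent cannot be freed (and its
# address reused) while the entry exists.
sub remember {
    my ($ref, $label) = @_;
    $registry{$ref} = { ref => $ref, label => $label };
    return;
}

{
    my $first  = [1, 2];
    my $second = [1, 2];
    remember($first,  "first");
    remember($second, "second");
}   # $first and $second go out of scope, but %registry still holds the arrays

print scalar(keys %registry), " entries\n";
for my $entry (values %registry) {
    print "$entry->{label}: @{ $entry->{ref} }\n";
}
```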

      Abigail

Re: using references as keys in a hash.
by jonadab (Parson) on Feb 23, 2003 at 12:38 UTC

    Others have pointed out why this is the way it is and pointed you toward modules that will work around it, but I'd like to point out a simpler, more blindingly obvious solution: Go ahead and use the reference as your key, but also store it as a value (in addition to whatever other values you are storing).

    There are two ways to do this, and which one you pick is a matter of style. You can use parallel hashes, with the same key across two or more hashes returning a related set of values, or you can use a nested hash. The latter is easier to make look similar to what you have in your code...

    %nestedhash = {
        $someref    => { ref => $someref,    val => "STUFF", },
        $anotherref => { ref => $anotherref, val => stuff(), },
    }

    Though the way hashes tend to be used in the real world, you're more likely to end up with something more like this...

    while (($ref, $val) = get_pair()) {
        my %thisrecord = { ref => $ref, val => $val };
        $record{$ref} = \%thisrecord;
    }

    Personally, I tend to use parallel hashes, which accomplishes roughly the same thing in a slightly different way, like so...

    while (($r, $v) = get_pair()) {
        $ref{$r} = $r;
        $val{$r} = $v;
    }

    sub H{$_=shift;while($_){$c=0;while(s/^2//){$c++;}s/^4//;$ v.=(' ','|','_',"\n",'\\','/')[$c]}$v}sub A{$_=shift;while ($_){$d=hex chop;for(1..4){$pl.=($d%2)?4:2;$d>>=1}}$pl}$H= "16f6da116f6db14b4b0906c4f324";print H(A($H)) # -- jonadab
      Personally, I tend to use parallel hashes, ...

      If two data structures are related, make that relationship OBVIOUS. Parallel data structures are not obviously related. In fact, it's a maintenance nightmare.

      Let's set up a thought experiment. There are four parallel data structures. It doesn't matter at all what they are, except they have the following properties:

      • A set of config-type parameters
      • Modified everywhere (whether global or passed around)
      • Within the fubar() function, only three are referenced. (Since every developer knows that the four are parallel, there's no commenting to mention the fourth.)
      I am your maintenance programmer. I come along and am told there is a bug in the fubar() function and that I need to fix it in 24 hrs. I go and realize that I need a particular value to make it right. I don't know that the value is in this fourth data structure. But I need to fix fubar() right now. So I add some crazy structure to get that fourth value into fubar(). The code is now worse.

      All of that is avoided by using a second level of data structures. Thus, this set of config-type parameters is handed around as one reference. I, the hapless maintainer, am shown by the very way the data is structured that my needed value is already there for me. I don't need to hack the code up and make my job harder, just to do my job.

      (And, in case you're thinking that this is a contrived thought experiment ... maintenance programmers are often given that exact task, with about that level of knowledge about the system. It's not a perfect world out there. It is our job as developers to think about the maintainer who will come after us. You will maintain at some point in your career and will thank the developer with forethought.)

      (If you think your code won't be maintained, remember this - that's what the mainframe developers in the 1970's thought when they used 2-digit years. I mean, who's going to keep this code around for 30(!) years?)

      ------
      We are the carpenters and bricklayers of the Information Age.

      Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

        If two data structures are related, make that relationship OBVIOUS.

        I agree with that.

        Parallel data structures are not obviously related.

        It seems obvious to me that if you see them assigned together, they're related. I did say it was a matter of style, however, and I expected some people to have a strong preference for the nested structures. I do use the nested structures in some cases, when what I want to do is a little more complex, or if there are multiple levels of nesting, or some other good reason. And I gave the example of using nested structures first. Don't read more into my statement about parallel structures than is there.

        In fact, it's a maintenance nightmare. Let's set up a thought experiment.

        Thought experiments can lead you to conclude that a heavier object will always fall faster than a lighter one. (They can also be useful, but you have to take them cum grano salis.)

        I am your maintenance programmer.

        Oooh, oooh, can I imagine that I named all my variables with single characters and used recursive nested evals wherever possible? ;-)

        I come along and am told there is a bug in the fubar() function and that I need to fix it in 24 hrs. I go and realize that I need a particular value to make it right. I don't know that the value is in this fourth data structure. But I need to fix fubar() right now. So I add some crazy structure to get that fourth value into fubar(). The code is now worse.

        The code will always be worse when someone who is not familiar with the code attempts to fix something right now without understanding how it works. No amount of wonderful data structure will change that. (This is not an argument for bad data structures; I'm merely pointing out that no data structure can prevent the scenario you describe.)

        Furthermore, unless I'm missing something, there's nothing magic about the syntax of nesting that will alert the unfamiliar programmer to the existence of more data than is being used in the piece of code he's viewing. A simplistic example...

        sub foobar {
            my ($object, $result);
            foreach $object (@_) {
                $result .= "Title:\t"  . $$object{title}  ."\n"
                        .  "Author:\t" . $$object{author} ."\n"
                        .  "-------------------------------\n";
            }
            return $result;
        }

        Will the programmer know to look in $$object{ISBN} for the piece of data he needs to fulfill the change request? Maybe, but if so it's not any more obvious than (with parallel structures) looking in $isbn{$key}. If he reads through the well-commented code, he'll find it either way.

        Of course, if the code is more complex and has a larger number of fields, then the nested structure can be traversed more efficiently, avoiding the bug in the first place...

        sub foobar {
            my ($object, $result, $f);
            foreach $object (@_) {
                foreach $f (sort @fields) {
                    $result .= "$f:\t" . $$object{$f} ."\n";
                }
                $result .= "-------------------------------\n";
            }
            return $result;
        }

        But the original poster is talking about what is currently a single hash storing a single value for each key, and I was suggesting also storing the unstringified reference used to create the hash key. That's a total of two fields: not complex enough to really need the nested structure, IMO. Yes, the nested structure will solve the problem nicely, but the parallel structure will also work.

        Note that I'm not saying that parallel structures are better, or even that they're as good in every case; I only said that which you use is a matter of style. The program will get the same result either way.


        All of that is avoided by using a second level of data structures. ...

        With a set of Unit Tests as insurance.

      Ironically, your code snippets above try to use references as keys, although unintentionally. The lines I'm referring to are   %nestedhash = { and   my %thisrecord = { ref => $ref, val => $val }; Both should be using parentheses instead of curly brackets.
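
      For reference, a sketch of the two assignments written with parentheses. The sample data ($someref, $anotherref, "MORE STUFF") is made up so the snippet runs on its own:

```perl
use strict;
use warnings;

my $someref    = [1, 2];
my $anotherref = { a => 1 };

# Parentheses build the list of key/value pairs for the hash;
# the inner braces still create anonymous hash references as values.
my %nestedhash = (
    $someref    => { ref => $someref,    val => "STUFF" },
    $anotherref => { ref => $anotherref, val => "MORE STUFF" },
);

my %thisrecord = ( ref => $someref, val => "STUFF" );

for my $key (keys %nestedhash) {
    my $entry = $nestedhash{$key};
    print "key |$key| => value |$entry->{val}|, stored ref is a ",
          ref($entry->{ref}), "\n";
}
```

      With curly brackets instead, the hash is assigned a single hash reference, which gets stringified into the only key (with an undefined value). Under warnings, perl reports "Reference found where even-sized list expected".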

      ihb

        Quite so. That'll teach me to post without testing.


Why use references as keys in a hash?
by Solo (Deacon) on Feb 23, 2003 at 14:32 UTC
    We've seen a few means of using references as hash keys. That's pretty cool, from a theoretical standpoint--but I was wondering what the practical application could be.

    What are examples of using references as hash keys to make problems easier?

    My example: suppose I wanted to keep a hash of my usernames for websites, using the prepared HTTP request for each site as the key.

    use Tie::RefHash;
    use HTTP::Request;
    use Data::Dumper;

    $r1 = HTTP::Request->new(GET => 'http://www.perlmonks.org/');
    $r2 = HTTP::Request->new(GET => 'http://use.perl.org/');
    # Etc...
    tie %sites, 'Tie::RefHash', ($r1, 'Solo', $r2, 'Solo');
    print Dumper(\%sites);

    It seems too trivial, but that's probably just my lack of vision ;)

    --Solo

    --
    I think my eyes are getting better. Instead of a big dark blur, I see a big light blur.

      I have a few examples. I always use Tie::RefHash:

      • A class (package) that needs to keep track of all its instances and look-up by instance when methods are called.
      • An "object mapper" method in a package that takes two sets of objects from classes derived from that package and maps objects from the first to objects in the second. The resulting map is a hash: first set objects are the keys; second set objects are the values.
      • An XML template parser tool that, at one point, associates names to pieces of the XML that are stored as trees. Those trees are hash references. Later, some of the processing requires that given a tree we find its name. That reverse lookup hash uses the tree references as the keys and the names as the values.
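
      The reverse lookup in the last item might look roughly like this. It is only a sketch: the tree contents and the names %tree_for and %name_of are invented for illustration, not taken from the actual tool:

```perl
use strict;
use warnings;
use Tie::RefHash;

# Hypothetical template fragments, stored as hash-reference trees by name.
my %tree_for = (
    header => { tag => 'div', children => [] },
    footer => { tag => 'p',   children => [] },
);

# Reverse lookup: tree reference => name.
tie my %name_of, 'Tie::RefHash';
$name_of{ $tree_for{$_} } = $_ for keys %tree_for;

# Given a tree, find its name.
my $tree = $tree_for{footer};
print "tree is named: $name_of{$tree}\n";

# The keys are still real references when iterating.
for my $t (keys %name_of) {
    print "$name_of{$t} has root tag $t->{tag}\n";
}
```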

Node Type: perlquestion [id://237836]
Approved by blakem
Front-paged by data64