in reply to Dotted hash access

As you can read in another node by me in this thread, I also dislike the typing exercise that one needs to practice every time a deep HoHoHoH element is needed. However, I don't like joining all keys together, because that makes iterating or assigning a reference to a deeper hash hard, or impossible, depending on the time available for hacking up ugly solutions.

Still, if I were to join keys together, I'd use Perl's own built-in mechanism for that: supply a list as a hash key and perl automatically joins it with $;. It'd be nice if there were an interpolating qw. :)
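
A minimal sketch of that mechanism, assuming made-up location and building names and a placeholder cost. It shows that the list in the subscript ends up as one flat key joined with $;, which is why a deeper level can't be grabbed as a reference:

```perl
use strict;
use warnings;

my %h;
my ($location, $building) = ('Paris', 'Mill');    # hypothetical values

# A list in the subscript is joined with $; into a single flat key.
$h{'Locations', $location, 'Buildings', $building, 'cost'} = 40;

# The same element, with the key spelled out by hand.
my $sep  = $;;    # the subscript separator, "\034" by default
my $flat = join $sep, 'Locations', $location, 'Buildings', $building, 'cost';
print "cost: $h{$flat}\n";

# Splitting on the separator recovers the parts, but %h is still one flat hash.
my @parts = split /\Q$sep\E/, (keys %h)[0];
print scalar(@parts), " parts in ", scalar(keys %h), " key\n";
```

There is no $h{'Locations', $location} entry to take a reference to; only the full five-part key exists.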

Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

Re^2: Dotted hash access
by sfink (Deacon) on Nov 28, 2004 at 19:08 UTC
    I'm not actually joining anything; I'm using blessed-hashes-of-blessed-hashes. (I sent this reply privately a minute ago, but then it sparked an idea.)

    But that gives me an idea, or perhaps it's what you already meant: it would be much better to use the same underlying implementation, but switch to using $; rather than a period as a separator, so that the example would be:

    $cost = $h->{'Locations',$location,'Buildings',$building,'cost'};

    No new syntax that way, although it does use an unfamiliar one in these post-Perl4 days. But also no smushing of keys together into one string, even if it's only temporary.

    And the more I think about it, the more it looks like this is what you meant -- since my immediate reaction to typing in the example was that an interpolating qw would be really nice! But your point about it being difficult to iterate over or assign to a deeper level isn't a problem with my current implementation.

    I think I'll go change my code to take an optional separator string parameter, defaulting to $;, so that you can do it either way. I'm not sure yet which I'll use; the lack of interpolation with the $; approach defeats much of the benefit.
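
    A rough sketch of what an optional separator could look like, written as a stand-alone helper rather than as the module's real interface. The name deep_get, its argument order, and the sample data are all assumptions for illustration:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: walk a nested hash along a joined key.
    # $sep is optional and defaults to $; (the subscript separator).
    sub deep_get {
        my ($data, $key, $sep) = @_;
        $sep = $; unless defined $sep;
        my $node = $data;
        for my $part (split /\Q$sep\E/, $key) {
            return unless ref $node eq 'HASH' && exists $node->{$part};
            $node = $node->{$part};
        }
        return $node;
    }

    # Placeholder data, nested the ordinary way.
    my $h = { Locations => { Paris => { Buildings => { Mill => { cost => 40 } } } } };

    # Default separator: the key is built the way a list subscript would build it.
    print deep_get($h, join($;, qw(Locations Paris Buildings Mill cost))), "\n";

    # Explicit separator: the dotted form.
    print deep_get($h, 'Locations.Paris.Buildings.Mill.cost', '.'), "\n";
    ```

    The \Q...\E matters here, since a period used as a separator would otherwise match any character in the split.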

    As for Perl6 -- we could always add in more than one interpolating context. You'd still need syntax to select them, of course. How about

    $code = %h{''Locations $location Buildings $building cost''};
    or maybe
    $code = %h{''Locations'$location'Buildings'$building'cost''};
    Ironically, I think this is already possible in Parrot with multipart keys, using PIR's
    $P1['Locations';$S1;'Buildings';$S2;'cost']
    The current aggregates' code will pass on any leftover portions of a key to the aggregate it just retrieved.

      The more I read this thread, the more I don't understand why you don't maintain a hash that is structured with only two levels, location and then building. Then your code looks like:

      my $code=$locations{$location}{$building}{code};

      Also, something to keep in mind (although it's not hugely critical): each deref takes time, each hash lookup takes time, and each unique key takes space. So in some circumstances your dotted approach would result in considerably more memory being taken up by the keys. Not only that, but deterministic traversal of your dotted form of the tree would be quite expensive compared to the non-dotted form. Overall I wouldn't go this route unless I had really strong justification to do so. And style isn't a strong justification IMO :-)

      ---
      demerphq

        The only way in which my exact data structure is relevant is if nothing like it should ever be created -- i.e., if I am making it far more complicated than it needs to be. I haven't given enough details about my data for anyone to come to that conclusion. That wasn't wholly unintentional. I wanted the meditation to be about a general problem that I occasionally encounter, and I was wondering whether other people encounter it too.

        For the record, your proposed refactoring of the data structure wouldn't work in my case because 'Locations' is one key of many at the top level. The whole structure is intended to represent an empire, which controls a set of locations, has various technologies, a name, etc. Some of those are simple values, some are themselves nested structures. A 'Location' (identified by a name) has a set of buildings, but also resources, a description, an inventory, etc. So simplifying things in the manner you suggest would result in Locations colliding with the strings 'Name', 'Technologies', etc.

        As for dereferences taking time -- um, my module uses a hash tie, and on every lookup it splits apart the key and does a recursive lookup. So it's far, far slower than just a set of dereferences! Performance is really not a concern. This is an exploration into readability.
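
        For anyone who wants to see the shape of that, here is a minimal sketch of a tied hash whose FETCH splits the key and descends one level per part. It is not the actual module: the package name DottedHash is made up, only the top level is tied, only FETCH is overridden, the separator is fixed to a period, and the data is a placeholder.

        ```perl
        use strict;
        use warnings;

        package DottedHash;    # hypothetical name, not the real module
        use Tie::Hash;
        use parent -norequire, 'Tie::StdHash';
        use Scalar::Util 'reftype';

        # FETCH splits the key on '.' and descends one hash level per part.
        sub FETCH {
            my ($self, $key) = @_;
            my $node = $self;
            for my $part (split /\./, $key) {
                return undef unless (reftype($node) // '') eq 'HASH';
                return undef unless exists $node->{$part};
                $node = $node->{$part};
            }
            return $node;
        }

        package main;

        tie my %h, 'DottedHash';
        $h{Locations} = { Paris => { Buildings => { Mill => { cost => 40 } } } };

        print $h{'Locations.Paris.Buildings.Mill.cost'}, "\n";    # dotted access
        print $h{Locations}{Paris}{Buildings}{Mill}{cost}, "\n";  # ordinary access
        ```

        Every dotted lookup pays for a split plus a loop on top of the usual hash lookups, which is the slowdown described above.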

        But your comment about keys suggests that you misunderstand what I'm doing. The actual data structure is merely a HoHoHo...H. I am never storing any keys in their dotted form. Doing so would prevent me from grabbing out pieces of the structure:

        $location_info = $h->{"Locations.$location"};
        print "$location description: $location_info->{Description}\n";
        I am just providing a hash reference that, in addition to being addressable the normal way, can also be addressed with an alternate dotted syntax. The internal structure is unchanged, and a deterministic traversal just requires a regular ASCIIbetical sort of the keys at each level (same as any other HoH.)
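
        A sketch of that traversal on a plain nested hash, assuming placeholder data and a made-up walk function. Keys are sorted at each level, and the dotted path is only built for reporting, never stored:

        ```perl
        use strict;
        use warnings;

        # Hypothetical walker: visit every leaf of a HoH...H in sorted key order,
        # handing the callback the leaf's dotted path and its value.
        sub walk {
            my ($node, $visit, @path) = @_;
            for my $key (sort keys %$node) {
                my $value = $node->{$key};
                if (ref $value eq 'HASH') {
                    walk($value, $visit, @path, $key);
                }
                else {
                    $visit->(join('.', @path, $key), $value);
                }
            }
        }

        # Placeholder data in the spirit of the structure described above.
        my $h = {
            Name      => 'Example Empire',
            Locations => {
                Paris => {
                    Description => 'a city',
                    Buildings   => { Mill => { cost => 40 } },
                },
            },
        };

        my @seen;
        walk($h, sub { push @seen, "$_[0] = $_[1]" });
        print "$_\n" for @seen;
        ```

        The order is the same on every run because each level is sorted before it is visited.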

        Sorry if this wasn't clear from the original post.