Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Re: RFC on Inline::C hack: Hash_Iterator

by esskar (Deacon)
on Aug 01, 2005 at 12:27 UTC ( #479867=note: print w/replies, xml ) Need Help??


in reply to RFC on Inline::C hack: Hash_Iterator

nice job;
is it possible to add a "prev" method?
and where can I find the documentation to HeNEXT ?
  • Comment on Re: RFC on Inline::C hack: Hash_Iterator

Replies are listed 'Best First'.
Re^2: RFC on Inline::C hack: Hash_Iterator
by tlm (Prior) on Aug 01, 2005 at 12:55 UTC

    ...and where can I find the documentation to HeNEXT ?

    HeNEXT is a macro (surprise!), defined in hv.h as

    #define HeNEXT(he) (he)->hent_next
    where hent_next is a field in the following struct (also from hv.h):
    typedef struct he HE; /* ... */ /* entry in hash value chain */ struct he { HE *hent_next; /* next entry in chain */ HEK *hent_hek; /* hash key */ SV *hent_val; /* scalar value that was hashed */ };
    So basically HeNEXT scoots along a linked list of hash entries (those belonging to a given bucket, to be precise). As you can see, the data structure only provides for travel in one direction. Therefore, I don't see an easy way to implement a prev method. If I had to have it, I think I'd just accept the memory hit and use the keys array-based implementation I described briefly in my OP. Then prev reduces to decrementing a cursor variable.

    the lowliest monk

      So basically HeNEXT scoots along a linked list of hash entries (those belonging to a given bucket, to be precise).

      Well.. you could always make use of a doubly-linked list to get around this. For the sake of the storage of a second pointer in your structure, this would seem a reasonable trade-off to give a more flexible iteration method.

      Just on a scalability note, and to raise awareness of a possible stumbling block if this code were to be used as anything other than an iterator, I'm always wary of a hash bucket containing a linked list. In this example, there's little to no issue - it's designed to be an iterator, and hence the list will always be traversed from start to end.

      For (most) implementations, however, where more random access of hashed elements is required, an iterative method is really quite inefficient, requiring up to time const + N to look up any entry. In these cases, it's often better to use something akin to a binary tree, offering at worst time const + log N to find any entry.

      For true flexibility, a linked list threaded through binary tree has been my structure of choice for awhile - offering a sane method of iterating through the structure while retaining a reasonable random-element-access time.

      Nice work - ++tlm.

        Well.. you could always make use of a doubly-linked list to get around this. For the sake of the storage of a second pointer in your structure, this would seem a reasonable trade-off to give a more flexible iteration method.

        The structures you are talking about are defined by Perl itself. Its not a design decision available to anybody but the pumpkings, and its unlikely they would appreciate the cost given the minimal benefits it would provide.

        For (most) implementations, however, where more random access of hashed elements is required, an iterative method is really quite inefficient, requiring up to time const + N to look up any entry. In these cases, it's often better to use something akin to a binary tree, offering at worst time const + log N to find any entry.

        Im not sure if I agree with this analysis. The linked lists used for buckets in perls hashes are intended to be extremely small, ie, generally they should hold only one element, and except for degenerate cases should not really exceed two elements. With this in mind a binary tree approach makes less sense as in most cases you will derive no benefit from it at all.

        Perls hashes dont allow for duplicate keys, which means that buckets only contain multiples when there are hash-key collisions. Such collisions should be unusual, and overly long bucket chains will be redistributed to other buckets when a resize event occurs, and iirc overly long bucket chains are precisely the determinant for such resize events.

        ---
        $world=~s/war/peace/g

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://479867]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (2)
As of 2023-01-29 20:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?