Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Hash Set

by tomazos (Deacon)
on Sep 11, 2005 at 06:51 UTC ( [id://491022]=perlquestion: print w/replies, xml ) Need Help??

tomazos has asked for the wisdom of the Perl Monks concerning the following question:

I want a set.

I think the best way is to use a hash to implement a set by using the existant keys to represent the current members of the set.

My current strategy is along the lines of:

my %set; sub contains($) { return exists $set{$_[0]} } sub add($) { $set{$_[0]} = 1 } sub all($) { return [keys %set] }

(I'm not actually using subs -- just for clarity.)

Is this approach the simplest way?

Is there a better approach avoiding the assignment for the add? Something like create $set{'foo'}?

I think I don't need the hash entries to be defined. I just need them to exist. (If you iterate over keys it will still return entries that have existant but undefined values right?)

-Andrew.


Andrew Tomazos  |  andrew@tomazos.com  |  www.tomazos.com

Replies are listed 'Best First'.
Re: Hash Set
by ambrus (Abbot) on Sep 11, 2005 at 07:15 UTC

    I think the best way is to use a hash to implement a set by using the existant keys to represent the current members of the set.

    ...

    Is this approach the simplest way?

    Probably yes, but really it depends on what operations you want to do on the set.

    If you, for example, never want to check if certain elements are included, you only want to iterate over the set, then it's better to use an array (or a list). This case is very common, just most people don't call that a set, but a list. C++ STL doesn't call that a set either.

    Also, if the set consists of small integers, and you don't want to iterate over them, then it may be better to use an array like

    my @set; sub contains { @set{$_[0]}; } sub add { @set{$_[0]} = 1; }

    Update: parv is right, I don't know what I was thinking when I wrote the above code. The right code is

    my @set; sub contains { $set[$_[0]]; } sub add { $set[$_[0]] = 1; }

    Or you can use a bit vector like

    my $set = ""; sub contains { vec($set, $_[0], 1); } sub add { vec($set, $_[0], 1) = 1; }
    Is there a better approach avoiding the assignment for the add? Something like create $set{'foo'}?

    Not really. Some people (me too sometimes) use $set{"foo"}++; but that's not really better at all, only shorter.

    I think I don't need the hash entries to be defined. I just need them to exist. (If you iterate over keys it will still return entries that have existant but undefined values right?)

    That's right.

    If you do not insist on values to be defined, just exist, the code you gave should still work. Moreover, there's a little advantage in that approach: you can add multiple elements to the set with a shorter syntax:

    $set{@elements} = ();
    I however sometimes still represent a set in such a way that all values are 1, because if I have to test for inclusion in the set many times in the code, I can just write $set{$elt} instead of exists($set{$elt}) that way.
      > Also, if the set consists of small integers, and you 
      > don't want to iterate over them, then it may be better
      > to use an array like
      >
      > my @set;
      > sub contains { @set{$_[0]}; }
      > sub add { @set{$_[0]} = 1; }
      

      It seems -- from the operations in the subs -- you have declared a hash called %set somewhere not shown. That, or you are confused about the hash and array operations, in which case correct syntax for working w/ the array would be ...

      sub contains { $set[ $_[0] ]; } sub add { $set[ $_[0] ] = 1; }

      - parv

      A list can contain multiple identical elements. A set cannot.

      I want it to be able to contain any scalar (not just integers).

      Specifically the operations I want to do are as per the subs I gave in my example:

      • add an element (no-op if already a member)
      • test if an element is contained
      • iterate over all the elements

      -Andrew.


      Andrew Tomazos  |  andrew@tomazos.com  |  www.tomazos.com
Re: Hash Set
by parv (Parson) on Sep 11, 2005 at 08:05 UTC

    Yes, you are correct in that values do not need to be defined. So in your add(), you could just assign undef.

    BTW, have you looked at set modules available via CPAN? They may come handy (no personal experience) if you would need yet unstated operations at a later date. If nothing, they may give you ideas about other implementaions.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://491022]
Approved by lidden
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (6)
As of 2024-04-23 18:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found