PerlMonks  

Re: Perl Idioms Explained - keys %{{map{$_=>1}@list}}

by thinker (Parson)
on Aug 04, 2003 at 15:46 UTC


in reply to Perl Idioms Explained - keys %{{map{$_=>1}@list}}

Hi broquaint,

Or, to keep the items in their original order (i.e. the order in which the first instance of each value is encountered):

my %seen; @uniq = grep ! $seen{$_}++, @list;
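A minimal sketch of the one-liner above as a complete script. The sample data is made up for illustration; the block form of grep is used only for readability and behaves the same as the expression form.

```perl
use strict;
use warnings;

# Hypothetical sample data with repeated values.
my @list = qw(pear apple pear fig apple pear);

# $seen{$_}++ returns the old count, so the test is true
# only the first time each value is met.
my %seen;
my @uniq = grep { !$seen{$_}++ } @list;

print "@uniq\n";    # pear apple fig
```

Unlike keys on a hash, the result keeps the order of first appearance.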

cheers

thinker

Re: Re: Perl Idioms Explained - keys %{{map{$_=>1}@list}}
by LanceDeeply (Chaplain) on Aug 04, 2003 at 18:58 UTC
    This is the way I've usually seen it done. I was curious, so I ran a benchmark against the two.
    use Benchmark;

    my @list;
    for ( 0..9999 ) {
        push @list, sprintf "%d", 100 * rand;
    }

    timethese( 1000, {
        'keys_map'  => sub { my @uniq = keys %{{ map {$_ => 1} @list }}; },
        'grep_seen' => sub { my %seen; my @uniq = grep ! $seen{$_}++, @list; },
    } );
    Yields the following output.

    Benchmark: timing 1000 iterations of grep_seen, keys_map...
     grep_seen: 13 wallclock secs (11.17 usr + 0.00 sys = 11.17 CPU) @ 89.52/s (n=1000)
      keys_map: 30 wallclock secs (29.28 usr + 0.00 sys = 29.28 CPU) @ 34.15/s (n=1000)
      Adding
      'keys_map_undef' => sub { my @uniq = keys %{{ map {$_ => undef} @list }}; },
      to test the undef suggestion, it turns out to be 15-20% faster than using 1.

      grep still wins, though.
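For anyone wanting to rerun the comparison, here is a rough sketch that puts all three variants side by side with cmpthese from the core Benchmark module. The iteration count of 200 and the %variant and %result names are my own choices, not from the posts above, and the rates will depend on the machine and the perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same shape of data as the benchmark above: 10,000 integers in 0..99.
my @list = map { int( 100 * rand ) } 0 .. 9999;

my %variant = (
    keys_map       => sub { my @uniq = keys %{{ map { $_ => 1 } @list }};     return @uniq },
    keys_map_undef => sub { my @uniq = keys %{{ map { $_ => undef } @list }}; return @uniq },
    grep_seen      => sub { my %seen; my @uniq = grep { !$seen{$_}++ } @list; return @uniq },
);

# Sanity check first: record what each variant returns, sorted,
# so the three results can be compared as sets.
my %result = map { $_ => join ',', sort { $a <=> $b } $variant{$_}->() } keys %variant;

# 200 iterations is an arbitrary choice to keep the run short.
cmpthese( 200, \%variant );
```

cmpthese prints a table of rates and relative percentages, which is easier to read than the raw timethese lines when more than two variants are involved.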
      --
      Mike

Re: Re: Perl Idioms Explained - keys %{{map{$_=>1}@list}}
by Jasper (Chaplain) on Aug 04, 2003 at 22:21 UTC
    You can also grep a list for items that occur a certain number of times (for example, if you wanted only the items that appear in the list twice):
    @doubles = grep ++$seen{$_} == 2, @list;
    Jasper

      That would be 2 or more times, right? Once ++$seen{$_} == 2 is true, you are immediately copying $_ to @doubles. So if it appeared a third time, you couldn't magically remove it again. Your code would be clearer if you used >= to emphasize that.
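A small sketch of that behaviour, using made-up data in which a appears three times, b and d twice, and c once. The counter test is true exactly once per value, at its second occurrence, so every value seen at least twice lands in the result once.

```perl
use strict;
use warnings;

# Hypothetical data: a x3, b x2, c x1, d x2.
my @list = qw(a b a c b a d d);

my %seen;
# The pre-increment bumps the count before comparing, so this
# fires only when a value is met for the second time.
my @doubles = grep { ++$seen{$_} == 2 } @list;

print "@doubles\n";    # a b d
```

Note that a is included even though it occurs three times. Swapping == for >= in this form would also select the third occurrence, so the result would contain a twice.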

      The following would probably do if you only wanted entries with exactly 2 occurrences:

      @doubles = grep $seen{$_} == 2, grep !$seen{$_}++, @list;

      But it is not pretty, or obvious. The first grep counts all occurrences and passes only the first instance of each entry to the second grep, which then checks how many were actually found.
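The chained form can be sketched as a complete script, again with made-up data. It relies on the inner grep running to completion before the outer grep starts, so %seen already holds the final counts when the outer test is evaluated.

```perl
use strict;
use warnings;

# Hypothetical data: a x3, b x2, c x1, d x2.
my @list = qw(a b a c b a d d);

my %seen;
# Inner grep: counts every occurrence, passes on first instances only.
# Outer grep: keeps those whose final count is exactly 2.
my @doubles = grep { $seen{$_} == 2 } grep { !$seen{$_}++ } @list;

print "@doubles\n";    # b d
```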

      There must be a better way!

      - Cees

      No, that won't fly. You need
      $seen{$_}++ for @list; my @doubles = grep $seen{$_} == 2, @list;
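A sketch of the two-pass version on the same kind of made-up data. One thing to be aware of: because the second pass walks the full list, each qualifying value comes back once per occurrence. The extra %emitted filter below is my own addition for the case where one copy of each is wanted.

```perl
use strict;
use warnings;

# Hypothetical data: a x3, b x2, c x1, d x2.
my @list = qw(a b a c b a d d);

# Pass 1: count everything.
my %seen;
$seen{$_}++ for @list;

# Pass 2: keep the elements whose total count is exactly 2.
my @doubles = grep { $seen{$_} == 2 } @list;
print "@doubles\n";    # b b d d

# Optional: collapse to one copy of each, in first-seen order.
my %emitted;
my @once = grep { !$emitted{$_}++ } @doubles;
print "@once\n";       # b d
```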

      Makeshifts last the longest.

Re^2: Perl Idioms Explained - keys %{{map{$_=>1}@list}}
by Aristotle (Chancellor) on Aug 04, 2003 at 20:12 UTC
    Note that this is slightly broken as is. To be entirely correct, you have to say
    my (%seen, $seen_undef); my @uniq = grep defined() ? !$seen{$_}++ : !$seen_undef++, @list;
    Of course, if you're fiddling with objects which cannot be compared for equality by stringification, it is still broken.
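To illustrate the undef case: used as a hash key, undef becomes the empty string, so the plain grep treats undef and '' as the same value (and warns under warnings). A minimal sketch with made-up data, where the @naive and @shown names are mine:

```perl
use strict;
use warnings;

# Hypothetical data containing both undef and the empty string.
my @list = ( 'a', undef, '', 'a', undef, '' );

# Plain version: undef is used as the key '', so '' is dropped as a duplicate.
my %naive;
my @naive = do { no warnings 'uninitialized'; grep { !$naive{$_}++ } @list };

# Version with a separate counter for undef.
my ( %seen, $seen_undef );
my @uniq = grep { defined($_) ? !$seen{$_}++ : !$seen_undef++ } @list;

# Render undef visibly so it can be told apart from the empty string.
my @shown = map { defined($_) ? "'$_'" : 'undef' } @uniq;
print scalar(@naive), " vs ", scalar(@uniq), "\n";    # 2 vs 3
print "@shown\n";                                     # 'a' undef ''
```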

    Makeshifts last the longest.

      Of course, if you're fiddling with objects which cannot be compared for equality by stringification, it is still broken.

      Well, so is every uniquification based on the keys of a hash! The point still stands that grep is faster than the original idiom presented (by a factor of roughly 2.6 in the benchmark above), if still as memory-hungry.

      ------
      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

        I didn't dispute that. :) I just pointed out some of the more subtle points to keep in mind here.

        Makeshifts last the longest.
