in reply to Re: promoting array to a hash
in thread promoting array to a hash

A slightly quicker version of dragonchilds unique() sub.

sub uniq2{ my %x; @x{ @_ } = (); keys %x }

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Replies are listed 'Best First'.
Re^3: promoting array to a hash
by saskaqueer (Friar) on Jun 13, 2004 at 08:02 UTC

    very important update: Please see Re^4: promoting array to a hash by BrowserUk for why the following code is horrifically wrong. Of course, that said, it requires one very simple s!my!our! to correct.

    I know that the map vs. slice benchmark has been done before, but just to do it again as a reminder :)

    #!perl -w use strict; use Benchmark ':all'; my @unsorted = map { join('', map { ('a'..'z','A'..'Z',0..9)[rand 62] } 1..50) } 1..5000; sub uniq_dragonchild { my %x; @x{@_} = @_; values %x } sub uniq_BrowserUk { my %x; @x{@_} = (); keys %x } sub uniq_Zaxo { keys %{ { map { $_ => undef } @_ } } } cmpthese( timethese(-60, { uniq_dragonchild => 'my @x = uniq_dragonchild(@unsorted)', uniq_BrowserUk => 'my @x = uniq_BrowserUk(@unsorted)', uniq_Zaxo => 'my @x = uniq_Zaxo(@unsorted)' } ) ); __END__ C:\>uniq.pl Benchmark: running uniq_BrowserUk, uniq_Zaxo, uniq_dragonchild for at +least 60 C PU seconds... uniq_BrowserUk: 64 wallclock secs (63.19 usr + 0.02 sys = 63.20 CPU) +@ 421025.4 1/s (n=26610069) uniq_Zaxo: 59 wallclock secs (60.08 usr + 0.03 sys = 60.11 CPU) @ 18 +6939.31/s (n=11237109) uniq_dragonchild: 64 wallclock secs (63.05 usr + 0.02 sys = 63.06 CPU +) @ 399674 .64/s (n=25204682) Rate uniq_Zaxo uniq_dragonchild uniq_Bro +wserUk uniq_Zaxo 186939/s -- -53% + -56% uniq_dragonchild 399675/s 114% -- + -5% uniq_BrowserUk 421025/s 125% 5% + --
      Adding this sub:
      sub uniq_Roy { my %x; grep {!$x{$_}++} @_ }
      I got these results
      Rate uniq_Zaxo uniq_dragonchild uniq_BrowserUk + uniq_Roy uniq_Zaxo 167192/s -- -19% -27% + -47% uniq_dragonchild 205487/s 23% -- -11% + -35% uniq_BrowserUk 230383/s 38% 12% -- + -27% uniq_Roy 317586/s 90% 55% 38% + --

      We're not really tightening our belts, it just feels that way because we're getting fatter.
        Excellent! Yours is faster and will DWIM with references. Good deal! :-)

        ------
        We are the carpenters and bricklayers of the Information Age.

        Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

        I shouldn't have to say this, but any code, unless otherwise stated, is untested

        Unfortunately, the benchmark you started with wasn't good. See Re^4: promoting array to a hash.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail

      Unfortunately, your benchmark is wrong. As the array @unsorted is lexical, it isn't visible within the benchmark code, so the subs are uniquifying an empty array. All your benchmark is measuring is how quickly the sub return nothing. Correcting that, I get these results.

      #!perl -w use strict; use Benchmark ':all'; our @unsorted = map { join('', map { ('a'..'z','A'..'Z',0..9)[rand 62] } 1..50) } 1..5000; sub uniq_dragonchild { my %x; @x{@_} = @_; values %x } sub uniq_BrowserUk { my %x; @x{@_} = (); keys %x } sub uniq_Zaxo { keys %{ { map { $_ => undef } @_ } } } sub uniq_Roy { my %x; grep {!$x{$_}++} @_ } our( @a, @b, @c, @d ); cmpthese( -1, { uniq_dragonchild => '@a = uniq_dragonchild( @unsorted )', uniq_BrowserUk => '@b = uniq_BrowserUk( @unsorted )', uniq_Zaxo => '@c = uniq_Zaxo(@unsorted)', uniq_Roy => '@d = uniq_Roy( @unsorted )', }); #my $a='a'; print $/, $a++, ' : ', "@{[ sort @$_ ]}" for ( \@a, \@b, \ +@c, \@d ); __END__ P:\test>363760 Rate uniq_dragonchild uniq_Zaxo uniq_Roy uniq +_BrowserUk uniq_dragonchild 18.1/s -- -21% -31% + -41% uniq_Zaxo 22.8/s 26% -- -13% + -25% uniq_Roy 26.3/s 45% 15% -- + -14% uniq_BrowserUk 30.6/s 69% 34% 16% + --

      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail

        Oh ouch, that bit me good. I can't believe I didn't find 420000+ iterations per second too many, considering it is handling a 5000 item list. Thanks BrowserUk (and obviously ++).

Re^3: promoting array to a hash
by dragonchild (Archbishop) on Jun 14, 2004 at 01:30 UTC
    It will be slightly quicker. However, it is of less use. My version will DWIM references while yours won't.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      Finding unique references and then sorting them huh?


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
        Who said anything about sorting them? And, this has been very useful when dealing with lists of singletons. :-)

        ------
        We are the carpenters and bricklayers of the Information Age.

        Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

        I shouldn't have to say this, but any code, unless otherwise stated, is untested