in reply to promoting array to a hash

sub unique { my %x;@x{@_}=@_;values %x} my @sorted_unique = sort unique (split ' ', do { local $\=undef;<> });

In other words, use the hash.

------
We are the carpenters and bricklayers of the Information Age.

Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

I shouldn't have to say this, but any code, unless otherwise stated, is untested

Replies are listed 'Best First'.
Re^2: promoting array to a hash
by BrowserUk (Patriarch) on Jun 13, 2004 at 05:24 UTC

    A slightly quicker version of dragonchilds unique() sub.

    sub uniq2{ my %x; @x{ @_ } = (); keys %x }

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail

      very important update: Please see Re^4: promoting array to a hash by BrowserUk for why the following code is horrifically wrong. Of course, that said, it requires one very simple s!my!our! to correct.

      I know that the map vs. slice benchmark has been done before, but just to do it again as a reminder :)

      #!perl -w use strict; use Benchmark ':all'; my @unsorted = map { join('', map { ('a'..'z','A'..'Z',0..9)[rand 62] } 1..50) } 1..5000; sub uniq_dragonchild { my %x; @x{@_} = @_; values %x } sub uniq_BrowserUk { my %x; @x{@_} = (); keys %x } sub uniq_Zaxo { keys %{ { map { $_ => undef } @_ } } } cmpthese( timethese(-60, { uniq_dragonchild => 'my @x = uniq_dragonchild(@unsorted)', uniq_BrowserUk => 'my @x = uniq_BrowserUk(@unsorted)', uniq_Zaxo => 'my @x = uniq_Zaxo(@unsorted)' } ) ); __END__ C:\>uniq.pl Benchmark: running uniq_BrowserUk, uniq_Zaxo, uniq_dragonchild for at +least 60 C PU seconds... uniq_BrowserUk: 64 wallclock secs (63.19 usr + 0.02 sys = 63.20 CPU) +@ 421025.4 1/s (n=26610069) uniq_Zaxo: 59 wallclock secs (60.08 usr + 0.03 sys = 60.11 CPU) @ 18 +6939.31/s (n=11237109) uniq_dragonchild: 64 wallclock secs (63.05 usr + 0.02 sys = 63.06 CPU +) @ 399674 .64/s (n=25204682) Rate uniq_Zaxo uniq_dragonchild uniq_Bro +wserUk uniq_Zaxo 186939/s -- -53% + -56% uniq_dragonchild 399675/s 114% -- + -5% uniq_BrowserUk 421025/s 125% 5% + --
        Adding this sub:
        sub uniq_Roy { my %x; grep {!$x{$_}++} @_ }
        I got these results
        Rate uniq_Zaxo uniq_dragonchild uniq_BrowserUk + uniq_Roy uniq_Zaxo 167192/s -- -19% -27% + -47% uniq_dragonchild 205487/s 23% -- -11% + -35% uniq_BrowserUk 230383/s 38% 12% -- + -27% uniq_Roy 317586/s 90% 55% 38% + --

        We're not really tightening our belts, it just feels that way because we're getting fatter.

        Unfortunately, your benchmark is wrong. As the array @unsorted is lexical, it isn't visible within the benchmark code, so the subs are uniquifying an empty array. All your benchmark is measuring is how quickly the sub return nothing. Correcting that, I get these results.

        #!perl -w use strict; use Benchmark ':all'; our @unsorted = map { join('', map { ('a'..'z','A'..'Z',0..9)[rand 62] } 1..50) } 1..5000; sub uniq_dragonchild { my %x; @x{@_} = @_; values %x } sub uniq_BrowserUk { my %x; @x{@_} = (); keys %x } sub uniq_Zaxo { keys %{ { map { $_ => undef } @_ } } } sub uniq_Roy { my %x; grep {!$x{$_}++} @_ } our( @a, @b, @c, @d ); cmpthese( -1, { uniq_dragonchild => '@a = uniq_dragonchild( @unsorted )', uniq_BrowserUk => '@b = uniq_BrowserUk( @unsorted )', uniq_Zaxo => '@c = uniq_Zaxo(@unsorted)', uniq_Roy => '@d = uniq_Roy( @unsorted )', }); #my $a='a'; print $/, $a++, ' : ', "@{[ sort @$_ ]}" for ( \@a, \@b, \ +@c, \@d ); __END__ P:\test>363760 Rate uniq_dragonchild uniq_Zaxo uniq_Roy uniq +_BrowserUk uniq_dragonchild 18.1/s -- -21% -31% + -41% uniq_Zaxo 22.8/s 26% -- -13% + -25% uniq_Roy 26.3/s 45% 15% -- + -14% uniq_BrowserUk 30.6/s 69% 34% 16% + --

        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
      It will be slightly quicker. However, it is of less use. My version will DWIM references while yours won't.

      ------
      We are the carpenters and bricklayers of the Information Age.

      Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

      I shouldn't have to say this, but any code, unless otherwise stated, is untested

        Finding unique references and then sorting them huh?


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail