
Maybe it is just semantics, but I don't think map is the same algorithm as for(;;){} at the language level. At a higher level, sure, but here we're only comparing Perl implementations, so I'm assuming "algorithm" at that level.

Really, I should have used a much bigger list for longrand, a million elements or so. My intention was to show the scalability difference that comes from iterating over the arguments correctly, but my results only hinted that such differences exist. Here they are again with yours added and a large data set, 1_000_000 this time. Note that we're talking about simonm(), which is what you had posted, not the new simonm2(), which is a real improvement and definitely the best of the bunch.

Here are my results with a 1_000_000x2 data set.

                 Rate ewijaya_l simonm_l pg_l scooterm_l aighearach_l aighearach2_l simonm2_l
ewijaya_l     13736/s        --     -48% -69%       -74%         -76%          -78%      -80%
simonm_l      26316/s       92%       -- -41%       -51%         -54%          -57%      -62%
pg_l          44248/s      222%      68%   --       -17%         -23%          -28%      -36%
scooterm_l    53191/s      287%     102%  20%         --          -7%          -14%      -23%
aighearach_l  57471/s      318%     118%  30%         8%           --           -7%      -17%
aighearach2_l 61728/s      349%     135%  40%        16%           7%            --      -11%
simonm2_l     69444/s      406%     164%  57%        31%          21%           13%        --
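For anyone who wants to rerun a comparison like this, here is a minimal sketch using cmpthese() from the core Benchmark module. The data sizes and the two candidate subs (uniq_loop and uniq_slice) are made up for illustration; they are not the functions from this thread, and the numbers will differ by platform.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data set: two array refs of random integers.
# The sizes here are an assumption, much smaller than the 1_000_000 run above.
my @lists = map { [ map { int rand 1000 } 1 .. 1000 ] } 1 .. 2;

# Illustrative candidate 1: explicit loop over each list (hypothetical name).
sub uniq_loop {
    my %seen;
    for my $list (@_) {
        $seen{$_} = undef for @$list;
    }
    return keys %seen;
}

# Illustrative candidate 2: one hash-slice assignment (hypothetical name).
sub uniq_slice {
    my %seen;
    @seen{ map @$_, @_ } = ();
    return keys %seen;
}

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese( -1, {
    loop  => sub { my @u = uniq_loop(@lists) },
    slice => sub { my @u = uniq_slice(@lists) },
} );
```

cmpthese() prints a rate column plus a percentage matrix in the same layout as the table above.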

The aighearach2 entry is my attempt, but using your map to populate the slice, as follows:

sub aighearach2 {
    my ( %unique );
    for ( my $i = 0; $i < @_; $i++ ) {
        @unique{ map @$_, @_ } = ();
    }
    return keys %unique;
}

I find it interesting how the implementations really start to separate from each other on larger data sets. I imagine that if they were run through Devel::DProf or something similar, it would turn out that memory consumption is the big difference. If anybody is still following this thread, that would be interesting to see...

Your results are a bit different than mine, perhaps because of the different platform.

I guess it is the increment in aighearach2() that makes simonm2() 13%(!) faster.
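One thing worth checking against that guess: the body of the for loop in aighearach2() never uses $i, so the whole slice assignment is repeated once per argument. Below is a small sketch comparing the posted sub with a loop-free variant. The name uniq_no_loop is made up for illustration, and this is not the simonm2() code, which isn't reproduced in this post.

```perl
use strict;
use warnings;

# As posted above: $i is never used inside the loop, so the same
# slice assignment over all of @_ runs once per argument.
sub aighearach2 {
    my ( %unique );
    for ( my $i = 0; $i < @_; $i++ ) {
        @unique{ map @$_, @_ } = ();
    }
    return keys %unique;
}

# Hypothetical loop-free variant (name is an assumption, not from the thread):
# one slice assignment already covers every element of every list.
sub uniq_no_loop {
    my %unique;
    @unique{ map @$_, @_ } = ();
    return keys %unique;
}

my @lists     = ( [ 1, 2, 3 ], [ 3, 4, 5 ] );
my @with_loop = sort { $a <=> $b } aighearach2(@lists);
my @no_loop   = sort { $a <=> $b } uniq_no_loop(@lists);

print "@with_loop\n";    # 1 2 3 4 5
print "@no_loop\n";      # 1 2 3 4 5
```

Both return the same set of keys, so any speed difference between them would come from the repeated work rather than from a different result.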

An important lesson can be learned, I think, by closely studying the changes you made between simonm() and simonm2(). It is a good contrast between an anonymous hash that is used simplistically and a named hash with its full power unleashed.

--
Snazzy tagline here

In reply to Re^5: More efficient way to get uniq list elements from list of lists by Aighearach
in thread More efficient way to get uniq list elements from list of lists by monkfan
