More efficient way to get uniq list elements from list of lists

monkfan has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: More efficient way to get uniq list elements from list of lists by dimar (Curate) on Nov 11, 2004 at 03:44 UTC
Whenever you want 'unique elements' you should immediately be thinking 'hash keys'. `my %hTemp; my @aTemp = map{@{$_}} ([1,2], [1,3], [2,3], [2,4], [3,4]); @hTemp{@aTemp} = (); print join ",",(sort keys %hTemp);` [download]	[reply] [d/l]
Re: More efficient way to get uniq list elements from list of lists by pg (Canon) on Nov 11, 2004 at 03:40 UTC
Use a hash: `use Data::Dumper; use strict; use warnings; sub edges2vertices { my %vertices; for my $i ( 0 .. $#_ ) { $vertices{$_[$i][$_]} = 1 for (0 .. $#{$_[$i]}); } return sort keys %vertices; } my @E = ([1,2], [1,3], [2,3,7,8], [2,4], [3,4]); my @nv= edges2vertices(@E); print Dumper(\@nv);` [download]	[reply] [d/l]
Re^2: More efficient way to get uniq list elements from list of lists by simonm (Vicar) on Nov 11, 2004 at 08:20 UTC
Or more compactly: `sub edges2vertices { sort keys %{ { map { $_ => undef } map @$_, @_ } } }` [download]	[reply] [d/l]
Re^3: More efficient way to get uniq list elements from list of lists by Aighearach (Initiate) on Nov 11, 2004 at 12:25 UTC
This is not only more compact, it's also a different, slower algorithm. See the benchmarked examples in my other post -- Snazzy tagline here	[reply]
Re^4: More efficient way to get uniq list elements from list of lists by simonm (Vicar) on Nov 11, 2004 at 18:52 UTC
Re^5: More efficient way to get uniq list elements from list of lists by Aighearach (Initiate) on Nov 11, 2004 at 23:01 UTC
Re: More efficient way to get uniq list elements from list of lists by Aighearach (Initiate) on Nov 11, 2004 at 04:40 UTC
Since the question was about efficiency, I've done some benchmarks with both the supplied data set, and a larger ( 10_000 x 2) random data set. Here are the results on my lil' old AMD K6-2 400 w/448M RAM: $ ./pm_temp Benchmark: timing 50000 iterations of aighearach, aighearach_longrand, + ewijaya, ewijaya_longrand, pg, pg_longrand, scooterm, scooterm_longr +and... aighearach: 1 wallclock secs ( 0.88 usr + 0.00 sys = 0.88 CPU) @ 56 +818.18/s (n=50000) aighearach_longrand: 1 wallclock secs ( 0.83 usr + 0.00 sys = 0.83 +CPU) @ 60240.96/s (n=50000) ewijaya: 4 wallclock secs ( 3.70 usr + 0.00 sys = 3.70 CPU) @ 13 +513.51/s (n=50000) ewijaya_longrand: 2 wallclock secs ( 3.22 usr + 0.00 sys = 3.22 CPU +) @ 15527.95/s (n=50000) pg: 1 wallclock secs ( 0.86 usr + 0.00 sys = 0.86 CPU) @ 58 +139.53/s (n=50000) pg_longrand: 1 wallclock secs ( 0.99 usr + 0.00 sys = 0.99 CPU) @ 5 +0505.05/s (n=50000) scooterm: 2 wallclock secs ( 0.95 usr + 0.01 sys = 0.96 CPU) @ 52 +083.33/s (n=50000) scooterm_longrand: 2 wallclock secs ( 0.86 usr + 0.00 sys = 0.86 CP +U) @ 58139.53/s (n=50000) Rate ewijaya ewijaya_longrand pg_longrand scoot +erm aighearach scooterm_longrand pg aighearach_longrand ewijaya 13514/s -- -13% -73% - +74% -76% -77% -77% -78% ewijaya_longrand 15528/s 15% -- -69% - +70% -73% -73% -73% -74% pg_longrand 50505/s 274% 225% -- +-3% -11% -13% -13% -16% scooterm 52083/s 285% 235% 3% + -- -8% -10% -10% -14% aighearach 56818/s 320% 266% 12% + 9% -- -2% -2% -6% scooterm_longrand 58140/s 330% 274% 15% +12% 2% -- -0% -3% pg 58140/s 330% 274% 15% +12% 2% 0% -- -3% aighearach_longrand 60241/s 346% 288% 19% +16% 6% 4% 4% -- [download] Here is the code: Read more... (2 kB)	[reply] [d/l] [select]
Re^2: More efficient way to get uniq list elements from list of lists by monkfan (Curate) on Nov 11, 2004 at 05:10 UTC
One posting to rule them all! Thanks so much for your exaustive evaluation and suggestion, Aighearach. Regards, Edward	[reply]
Re: More efficient way to get uniq list elements from list of lists by Aighearach (Initiate) on Nov 11, 2004 at 03:55 UTC
In the first place you could have replaced `push @vertices` with `$unique{ $edges[$i][$j] } = undef` and then you don't have to mess with shoehorning `@uniqv` into a hash just to pull out the keys. Abstracting out the edges and vertices, what I see is going on is, you have a list of lists, and you just want the elements that are unique across the whole thing. I would do something like: sub unique_elements { my ( %unique ); # @_ could be large, so I don't want to put the list in memory, or + even put a list of indices into mem like you do with 0 .. $#_ for ( my $i = 0; $i < @_; $i++ ) { # but the inner lists are a small fixed size, so even though t +he sollution is generalized, the inner values I do just stick in a li +st, which I use for a hash slice to autovivify the keys @unique{@{$_[$i]}} = undef; # black magic warning! the first e +lement in the slice is assigned undef, and the rest are autovivfied } return sort keys %unique; } [download] -- Snazzy tagline here	[reply] [d/l] [select]
Re: More efficient way to get uniq list elements from list of lists by Zaxo (Archbishop) on Nov 12, 2004 at 03:26 UTC
Judging by your variable and sub names, you may want to look at Graph.pm. It may not be most efficient at this particular function, but it will certainly be more efficient over the whole application - more efficient of your development time if nothing else. Written in terms of Graph.pm, your code looks like this: `use Graph; my @K = ([1,2], [1,3], [2,3], [2,4], [3,4]); my $g = Graph->new; $g->add_edges( map {@$_} @K ); print $g->vertices;` [download] After Compline, Zaxo	[reply] [d/l]