monkfan has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I have this subroutine that try to convert list of list to get unique element of it into array
I just wonder if beloved monks have a more efficient and compact way to do this clumsy subroutines of mine.
sub edges2vertices { #AoA to uniq array elem; my @edges = @_; my @vertices; my @uniqv; for my $i ( 0 .. $#edges ) { for my $j ( 0 .. $#{$edges[$i]} ) { push @vertices, $edges[$i][$j]; } } @uniqv = sort keys %{{map {$_,1} @vertices}}; return @uniqv; }
So that with
my @E = ([1,2], [1,3], [2,3], [2,4], [3,4]); my @nv; @nv = edges2vertices(@E); # will return: __END__ $VAR1 = [ '1', '2', '3', '4' ];
Regards,
Edward

Replies are listed 'Best First'.
Re: More efficient way to get uniq list elements from list of lists
by dimar (Curate) on Nov 11, 2004 at 03:44 UTC

    Whenever you want 'unique elements' you should immediately be thinking 'hash keys'.

    my %hTemp; my @aTemp = map{@{$_}} ([1,2], [1,3], [2,3], [2,4], [3,4]); @hTemp{@aTemp} = (); print join ",",(sort keys %hTemp);
Re: More efficient way to get uniq list elements from list of lists
by pg (Canon) on Nov 11, 2004 at 03:40 UTC

    Use a hash:

    use Data::Dumper; use strict; use warnings; sub edges2vertices { my %vertices; for my $i ( 0 .. $#_ ) { $vertices{$_[$i][$_]} = 1 for (0 .. $#{$_[$i]}); } return sort keys %vertices; } my @E = ([1,2], [1,3], [2,3,7,8], [2,4], [3,4]); my @nv= edges2vertices(@E); print Dumper(\@nv);
      Or more compactly:
      sub edges2vertices { sort keys %{ { map { $_ => undef } map @$_, @_ } } }
        This is not only more compact, it's also a different, slower algorithm. See the benchmarked examples in my other post

        --
        Snazzy tagline here
Re: More efficient way to get uniq list elements from list of lists
by Aighearach (Initiate) on Nov 11, 2004 at 04:40 UTC
    Since the question was about efficiency, I've done some benchmarks with both the supplied data set, and a larger ( 10_000 x 2) random data set. Here are the results on my lil' old AMD K6-2 400 w/448M RAM:
    $ ./pm_temp Benchmark: timing 50000 iterations of aighearach, aighearach_longrand, + ewijaya, ewijaya_longrand, pg, pg_longrand, scooterm, scooterm_longr +and... aighearach: 1 wallclock secs ( 0.88 usr + 0.00 sys = 0.88 CPU) @ 56 +818.18/s (n=50000) aighearach_longrand: 1 wallclock secs ( 0.83 usr + 0.00 sys = 0.83 +CPU) @ 60240.96/s (n=50000) ewijaya: 4 wallclock secs ( 3.70 usr + 0.00 sys = 3.70 CPU) @ 13 +513.51/s (n=50000) ewijaya_longrand: 2 wallclock secs ( 3.22 usr + 0.00 sys = 3.22 CPU +) @ 15527.95/s (n=50000) pg: 1 wallclock secs ( 0.86 usr + 0.00 sys = 0.86 CPU) @ 58 +139.53/s (n=50000) pg_longrand: 1 wallclock secs ( 0.99 usr + 0.00 sys = 0.99 CPU) @ 5 +0505.05/s (n=50000) scooterm: 2 wallclock secs ( 0.95 usr + 0.01 sys = 0.96 CPU) @ 52 +083.33/s (n=50000) scooterm_longrand: 2 wallclock secs ( 0.86 usr + 0.00 sys = 0.86 CP +U) @ 58139.53/s (n=50000) Rate ewijaya ewijaya_longrand pg_longrand scoot +erm aighearach scooterm_longrand pg aighearach_longrand ewijaya 13514/s -- -13% -73% - +74% -76% -77% -77% -78% ewijaya_longrand 15528/s 15% -- -69% - +70% -73% -73% -73% -74% pg_longrand 50505/s 274% 225% -- +-3% -11% -13% -13% -16% scooterm 52083/s 285% 235% 3% + -- -8% -10% -10% -14% aighearach 56818/s 320% 266% 12% + 9% -- -2% -2% -6% scooterm_longrand 58140/s 330% 274% 15% +12% 2% -- -0% -3% pg 58140/s 330% 274% 15% +12% 2% 0% -- -3% aighearach_longrand 60241/s 346% 288% 19% +16% 6% 4% 4% --
    Here is the code:
      One posting to rule them all!
      Thanks so much for your exaustive evaluation and suggestion, Aighearach.
      Regards,
      Edward
Re: More efficient way to get uniq list elements from list of lists
by Aighearach (Initiate) on Nov 11, 2004 at 03:55 UTC
    In the first place you could have replaced push @vertices with $unique{ $edges[$i][$j] } = undef and then you don't have to mess with shoehorning @uniqv into a hash just to pull out the keys.

    Abstracting out the edges and vertices, what I see is going on is, you have a list of lists, and you just want the elements that are unique across the whole thing.

    I would do something like:

    sub unique_elements { my ( %unique ); # @_ could be large, so I don't want to put the list in memory, or + even put a list of indices into mem like you do with 0 .. $#_ for ( my $i = 0; $i < @_; $i++ ) { # but the inner lists are a small fixed size, so even though t +he sollution is generalized, the inner values I do just stick in a li +st, which I use for a hash slice to autovivify the keys @unique{@{$_[$i]}} = undef; # black magic warning! the first e +lement in the slice is assigned undef, and the rest are autovivfied } return sort keys %unique; }

    --
    Snazzy tagline here
Re: More efficient way to get uniq list elements from list of lists
by Zaxo (Archbishop) on Nov 12, 2004 at 03:26 UTC

    Judging by your variable and sub names, you may want to look at Graph.pm. It may not be most efficient at this particular function, but it will certainly be more efficient over the whole application - more efficient of your development time if nothing else.

    Written in terms of Graph.pm, your code looks like this:

    use Graph; my @K = ([1,2], [1,3], [2,3], [2,4], [3,4]); my $g = Graph->new; $g->add_edges( map {@$_} @K ); print $g->vertices;

    After Compline,
    Zaxo