in reply to Finding largest common subset in lists?

This seems to work:
use strict; my @a = qw( fred bob joe jim mary elaine ); my @b = qw( frank joe jim mary bob ); my @result; while (@a and @a > @result) { my $start = 0; while (@b - $start > @result) { my $end = 0; my @maybe = (); $end++ while (exists($a[$end]) and $a[$end] eq $b[$start+$end] +); if (--$end > @result) { @result = @a[0..$end]; } $start++; } shift(@a); } print "@result\n";
Note that I am destroying @a !

Jenda
Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
   -- Rick Osborne

Edit by castaway: Closed small tag in signature

Replies are listed 'Best First'.
Re: Re: Finding largest common subset in lists?
by anjiro (Beadle) on Jun 06, 2003 at 04:34 UTC
    So close! It works almost all the time, but this test case fails:
    my @a = qw( a b c d e f g h i j k l m n o p q r s t u v w x y z ); my @b = qw( a b c X f g h X l m n X j k a b c d );
    Running on this, I get a b c as the result instead of a b c d. I'm looking to see if I can figure out why now, but I figured I'd post in case someone's faster than me (likely).

      I'm stupid, stupid, stupid. There should be:

      .... if ($end > @result) { @result = @a[0..$end-1]; } ...
      instead of
      ... if (--$end > @result) { @result = @a[0..$end]; } ...
      The way I have it I'd only update the @result if the newly found list was longer by at least 2 items. The usual off-by-one error :-(

      Jenda
      Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
         -- Rick Osborne

      Edit by castaway: Closed small tag in signature

Re: Re: Finding largest common subset in lists?
by Jenda (Abbot) on Jun 05, 2003 at 15:50 UTC

    Also if the lists are huge and a lot of the items only appear on one of the lists it may be better to filter out those "unique" elements before you start looking for matches. Of course you can't remove them completely, you have to keep a "marker" there that's not equal to any other element nor marker.

    use strict; my @a = qw(fred bob joe jim ethfgh mary elaine foo bar bob foo too); my @b = qw(frank joe jim dfkjhgdkjfg mary bob srere dfhgerg wet sdfwer + mary); my (%seenA, %seenB); { my $i = 0; @seenA{@a}=undef; @seenB{@b}=undef; my $last = -1; @a = map {exists $seenB{$_} ? ($last = $_) : (defined($last) ? ($l +ast = undef) : ())} @a; my $last = -1; @b = map {exists $seenA{$_} ? ($last = $_) : (defined($last) ? ($l +ast = undef) : ())} @b; } print "@a\n"; print "@b\n\n"; shift(@a) unless defined $a[0]; shift(@b) unless defined $b[0]; pop(@a) unless defined $a[-1]; pop(@b) unless defined $b[-1]; print "@a\n"; print "@b\n\n"; my @result; while (@a and @a > @result) { my $start = 0; while (@b - $start > @result) { my $end = 0; my @maybe = (); $end++ while (defined($a[$end]) and defined($b[$start+$end]) a +nd $a[$end] eq $b[$start+$end]); if (--$end > @result) { @result = @a[0..$end]; } $start++; } shift(@a); } print "@result\n";

    I assume the lists did not contain any undefs!

    Jenda
    Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
       -- Rick Osborne

    Edit by castaway: Closed small tag in signature