I also went the route of creating a regular expression, and here is what I came up with. This RE most likely has abysmal exponential performance, as it backtracks extensively. It also does not work with duplicate elements in the input stream, as a (Perl) regular expression will be content with the leftmost match - had Perl a POSIX RE engine, that engine should match the longest match from what I remember...

Anyway, the code "simply" constructs a regular expression of one list that matches all possible subsequences in order, and then matches that list against the other list as a string. Spaces are used as the delimiters between the elements.

#!/usr/bin/perl -w use strict; use Test::More tests => 3; sub lcos { my ($list1,$list2) = @_; my @list1 = @$list1; my @list2 = @$list2; # Assume that neither list1 nor list2 contain elements # that stringify to the same value, that is, neither # list1 nor list2 contain both "" and undef. # Also, a blank is chosen as a delimiting character which # shall appear in no element of list1 or list2 $list2 = " " . join(" ", @list2) . " "; my $matcher = lcos_match(@list1); #print $matcher,"\n"; my @result; @result = split " ", $1 if $list2 =~ $matcher; @result; }; sub lcos_match { my $match = '('; while (@_) { $match .= match_sequence(@_); shift; @_ and $match .= "|"; }; $match .= ')'; $match; }; sub match_sequence { my $result = ""; my $element; while (defined ($element = pop)) { $result = $result ? qr{ \Q$element\E$result?} : qr{ \Q$element\E } +; }; $result; }; while (<DATA>) { my ($l1,$l2,$expected) = split /\s*\|\s*/; my @l1 = split /\s*/, $l1; my @l2 = split /\s*/, $l2; my @expected = split /\s*/, $expected; $" = ","; $, = ","; is_deeply( [ lcos(\@l1,\@l2)], \@expected, "@l1 | @l2"); }; __DATA__ a b c d f g h j q z | a b c d e f g i j k r x y z | a b c d a b c | a b x c | a b a b c d | a b x b c d | b c d

In reply to Re: Re: Finding largest common subset in lists? by Corion
in thread Finding largest common subset in lists? by anjiro

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.