Re: Finding largest common subset in lists?

by broquaint (Abbot)
on Jun 05, 2003 at 09:45 UTC (#263260)

in reply to Finding largest common subset in lists?

Update: fixed code to work with zby's case and hopefully any other case.
Update: code won't work with duplicates in the second list

Not exactly vastly mystical, but this should do the trick (although it hasn't been thoroughly tested):

    use strict;

    my @a = qw/ fred bob joe jim mary elaine /;
    my @b = qw/ frank joe jim mary bob /;
    print "LCS[anjiro] - ", join($", get_lcs(\@a, \@b)), $/;

    @a = qw/ a b c /;
    @b = qw/ a b x c /;
    print "LCS[zby] - ", join($", get_lcs(\@a, \@b)), $/;

    sub get_lcs {
        my @a = @{ +shift };
        my @b = @{ +shift };
        my %map = map { $b[$_] => $_ } 0 .. $#b;
        my(@lcs, @tmp);
        for(0 .. $#a) {
            next unless exists $map{$a[$_]}
                     or $a[$_ + 1] eq $b[$map{$a[$_]} + 1];
            push @tmp, $a[$_]
                if $a[$_] eq $b[$map{$a[$_]}];
            if($a[$_ + 1] ne $b[$map{$a[$_]} + 1]) {
                @lcs = @tmp if @tmp > @lcs;
                @tmp = ();
            }
        }
        @lcs = @tmp if @tmp > @lcs;
        return @lcs;
    }

    __output__
    LCS[anjiro] - joe jim mary
    LCS[zby] - a b

You might also find some roughly applicable questions under Longest Common Substring.


Replies are listed 'Best First'.
Re: Re: Finding largest common subset in lists?
by Aragorn (Curate) on Jun 05, 2003 at 10:21 UTC
    With use warnings it gives a "Use of uninitialized value in string eq at..." warning. Small off-by-one error.


    Changing

    push @tmp, $a[$_] if ( $a[$_ + 1]) eq ( $b[$map{$a[$_]} + 1] ) or @tmp >= 1;

    to

    push @tmp, $a[$_] if ($a[$_ + 1] and $b[$map{$a[$_]} + 1] and ( $a[$_ + 1]) eq ( $b[$map{$a[$_]} + 1] ) or @tmp >= 1);

    makes Perl happy again.
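
The warning arises when $a[$_ + 1] or $b[$map{$a[$_]} + 1] reads one element past the end of a list. As a minimal sketch (the helper name next_matches is hypothetical and not part of the posted code), the comparison could be wrapped so that both indices are bounds-checked before the string comparison. Unlike a plain truth test, this would also accept elements such as '0' or the empty string:

```perl
use strict;
use warnings;

# Hypothetical helper: true only when both lists have a following
# element and those two elements are equal as strings.
sub next_matches {
    my ($xs, $i, $ys, $j) = @_;
    return 0 if $i + 1 > $#$xs or $j + 1 > $#$ys;   # would run off the end
    return $xs->[$i + 1] eq $ys->[$j + 1] ? 1 : 0;
}

my @a = qw/ a b c /;
my @b = qw/ a b x c /;

print next_matches(\@a, 0, \@b, 0), "\n";   # 'b' follows 'a' in both lists
print next_matches(\@a, 1, \@b, 1), "\n";   # 'c' versus 'x'
print next_matches(\@a, 2, \@b, 3), "\n";   # both already at the last element
```

The three calls print 1, 0 and 0, and the last one no longer touches an undefined element.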


Re: Re: Finding largest common subset in lists?
by zby (Vicar) on Jun 05, 2003 at 10:58 UTC
    I just felt it was too simple. For @a = qw(a b c); @b = qw(a b x c) it prints LCS - a b c.
Re: Re: Finding largest common subset in lists?
by zby (Vicar) on Jun 05, 2003 at 12:44 UTC
    The updated code does not work for @a = qw(a b c d); @b = qw(a b x b c d). It prints a b c d. The output is surprising to me; I created the inputs to catch another kind of mistake. I believe you can't do it in one sweep.
      The output is due to the fact that 'b' is repeated in @b so the offset in %map is for the second 'b'. I think I'll leave the code as it is for the time being and just add a caveat that it won't work when there are duplicates in @b. Thanks once again :)
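
To make the cause concrete, here is a small sketch that uses the same %map construction as the posted code on zby's second list. When a value occurs more than once in @b, each later key/value pair overwrites the earlier one, so only the last offset survives:

```perl
use strict;
use warnings;

my @b = qw/ a b x b c d /;

# Same construction as in get_lcs: later pairs overwrite earlier ones,
# so a repeated element keeps only its last offset.
my %map = map { $b[$_] => $_ } 0 .. $#b;

print "offset of 'b': $map{b}\n";   # prints 3, not 1
```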


        It was aimed at duplicates, but I did not take the details into account, just the general algorithm. I believe you cannot do it in one sweep when there are duplicates.
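
One way to cope with duplicates is to give up the single sweep and compare every position in the first list against every position in the second, as in the classic dynamic-programming approach to the longest common substring problem. What follows is a minimal sketch of that technique applied to lists, not a fix to the code posted above. The name longest_common_run is made up for illustration, and the sketch assumes that "largest common subset" means the longest run of adjacent elements shared by both lists:

```perl
use strict;
use warnings;

# Longest run of adjacent elements common to both lists.
# $len[$i][$j] holds the length of the common run that ends at
# $xs->[$i - 1] and $ys->[$j - 1]. Takes time proportional to
# (size of @$xs) * (size of @$ys).
sub longest_common_run {
    my ($xs, $ys) = @_;
    my @len = map { [ (0) x (@$ys + 1) ] } 0 .. @$xs;
    my ($best, $end) = (0, 0);
    for my $i (1 .. @$xs) {
        for my $j (1 .. @$ys) {
            next unless $xs->[$i - 1] eq $ys->[$j - 1];
            $len[$i][$j] = $len[$i - 1][$j - 1] + 1;
            # Remember the longest run seen so far and where it ends in @$xs.
            ($best, $end) = ($len[$i][$j], $i) if $len[$i][$j] > $best;
        }
    }
    return @{$xs}[ $end - $best .. $end - 1 ];
}

print join(' ', longest_common_run([qw/ a b c d /], [qw/ a b x b c d /])), "\n";
```

For zby's input this prints b c d. When several runs tie for the longest, the sketch returns the one that ends earliest in the first list.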
