Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello, o wise ones.

Disclaimer in advance: This is homework. The task is to determine whether a list is a sublist of another (sub in_list()). Order matters, these are real lists, not sets. I coded a solution, see below.

Can you think of any other approaches, possibly simpler and without using LCS?

use Test::More tests => 6; use List::MoreUtils qw(each_arrayref); use Algorithm::Diff qw(LCS); sub _identic { # Returns true is two lists are identic. my ($lsa, $lsb) = @_; return unless scalar @{$lsa} == scalar @{$lsb}; # bail if unequ +al length my $iterator = each_arrayref($lsa, $lsb); while (my ($first, $second) = $iterator->()) { return unless $first eq $second; # bail if two elements at the same position differ } # at here, all elements are pairwise identic return 1; } sub in_list { # Returns true if lsa is a proper subsequence of lsb my ($lsa, $lsb) = @_; my $LCS = LCS($lsa, $lsb); return _identic($LCS, $lsa); } my (@lsa, @lsb); @lsa = (1, 2, 3); @lsb = (1, 2, 3, 4, 5); ok in_list(\@lsa, \@lsb), 'partial list at start'; @lsa = (1, 2, 3); @lsb = (2, 1, 2, 3); ok in_list(\@lsa, \@lsb), 'partial list at end'; @lsa = (1, 2, 3); @lsb = (3, 2, 1); ok !in_list(\@lsa, \@lsb), 'same elements, but nothing in common due t +o order'; @lsa = (); @lsb = (); ok in_list(\@lsa, \@lsb), 'null list, identic'; @lsa = (1); @lsb = (2); ok !in_list(\@lsa, \@lsb), 'single element, different'; @lsa = (1); @lsb = (1); ok in_list(\@lsa, \@lsb), 'single element, identic';

Replies are listed 'Best First'.
Re: sublist of a list
by Corion (Patriarch) on Nov 02, 2008 at 16:20 UTC

    If order matters, your subroutine in_list can simply check if the first element of list A ($lsa->[0]) was equal to the first element of list B ($lsb->[0]). If so, then the remainder of list A must be a subset of the remainder of list B. If not, then list A must be a subset of the remainder of list B. If list A is empty, it is (per definition) a subset of list B. If list B is empty, then list A cannot be a subset (unless it is empty itself).

    The above description (if it's correct which I leave to you to check) would lead to the following algorithm, which destroys both lists:

    sub in_list_recursive { my ($lsa,$lsb) = @_; if (@$lsa == 0) { return 1; # yay } elsif (@$lsb == 0) { return; # nay } elsif ($lsa->[0] == $lsb->[0]) { shift @$lsa; shift @$lsb; return in_list_recursive( $lsa, $lsb ); } else { shift @$lsb; return in_list_recursive( $lsa, $lsb ); }; };

    This algorithm is far easier written using a loop variable and a nondestructive approach to the lists, but the conversion is left to you.

      Thanks a lot :)
Re: sublist of a list
by ikegami (Patriarch) on Nov 02, 2008 at 21:14 UTC
    This should work too:
    use Algorithm::Diff qw( LCS_length ); sub in_list { # Returns true if lsa is a proper subsequence of lsb my ($lsa, $lsb) = @_; return LCS_length($lsa, $lsb) == @$lsa; }

    And a module-less solution:

    sub in_list { # Returns true if lsa is a proper subsequence of lsb my ($lsa, $lsb) = @_; my $ia = 0; my $ib = 0; for (;;) { return 1 if $ia == @$lsa; return 0 if $ib == @$lsb; ++( $lsa->[$ia] eq $lsb->[$ib] ? $ia : $ib ); } }

    Update: Added second solution.
    Update: Sorry, I missed the bit about this being homework.

Re: sublist of a list
by kyle (Abbot) on Nov 02, 2008 at 21:18 UTC

    It's cheap, but the first thing I thought of was to serialize the lists and use index to see if one is a substring of the other. Something like...

    sub serialize { my @copy = @_; foreach my $o ( @copy ) { $o =~ s/\\/\\\\/g; # escape escape char $o =~ s/,/\\,/g; # escape delimiter } return join q{,}, @copy; } sub in_list { my ( $la_ref, $lb_ref ) = @_; my @la = @{$la_ref}; my @lb = @{$lb_ref}; return ( !@la || 0 <= index serialize( @lb ), serialize( @la ) ); }

      Two bugs.

      in_list([','], ['a\\', 'b']) returns true. It should return false.

      in_list([qw( 1 2 4 )], [qw( 1 2 3 4 )]) returns false. It should return true since the OP is searching for a subsequence, not a substring.

      Update: My fix for the first bug was equally buggy. Removed.

Re: sublist of a list
by dragonchild (Archbishop) on Nov 03, 2008 at 12:45 UTC
    Note: I'm not going to bother testing these because I'm just illustrating an approach. In essence, is $l1 an ordered subset of $l2?
    use List::MoreUtils qw( all first_index indexes ); # Is $l1 in $l2? It's kinda a bit destructive. sub in_list { my ($l1, $l2) = @_; my $head = shift @$l1; my $head_idx = first_index { $head eq $_ } @$l2; return unless defined $head_idx; # Make sure $l2 is still big enough to hold $l1 return unless $#$l2 - $head_idx >= $#$l1; return all { $l1->[$_] eq $l2->[$_ + $head_idx] } indexes @$l1; }
    The best solution, however, will allow for an optional equality test. You never know what kind of things are in those arrays. Oh, and we shouldn't be destructive.
    sub in_list { my ($l1, $l2, $comp) = @_; $comp ||= sub { $_[0] eq $_[1] }; my $head = $l1->[0]; my $head_idx = first_index { $comp->($head,$_) } @$l2; return unless defined $head_idx; # Make sure $l2 is still big enough to hold $l1 return unless $#$l2 - $head_idx >= $#$l1 - 1; # It might be cheaper to re-compare the head elements. But, since +$comp is potentially # client-defined, we'd best not assume anything. $comp might not e +ven be side-effect # free!! *gasps* return all { $comp->($l1->[$_ + 1], $l2->[$_ + $head_idx]) } grep +{ $_ } indexes @$l1; }

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?