Re^2: True Brute Force Longest Common Sub Sequence

If you just want to use brute force then using a hash is really cheating as it is already the begining of optimisation. We need nasty embedded loops to go O(2^n*2^m). I think something like this represents to worst way to do it, first generating all possible subsequences and then testing every case. This does however seem to be the point.

use strict;

sub subseqs {
   my $str = $_[0];
   my @seqs = ('');
   for my $char (split //, $str) {
       for my $i ( 0..@seqs-1 ) {
           push @seqs, $seqs[$i].$char;
       }
   }
   return @seqs;
}

my $s1 = 'abcdefgh';
my $s2 = 'abecdgh';
my $best_len = 0;
my $longest;
for my $seq1 (subseqs($s1)) {
    for my $seq2 (subseqs($s2)) {
        next unless $seq1 eq $seq2;
        if (length($seq1) > $best_len) {
            $longest = $seq1;
            $best_len = length($seq1);
        } 
    }
}
print $longest;
[download]

Note that there is an edge case when 2 or more common subsequences share the same length only the first will be found.

Comment on Re^2: True Brute Force Longest Common Sub Sequence Download Code

Replies are listed 'Best First'.
Re^3: True Brute Force Longest Common Sub Sequence by ikegami (Patriarch) on Nov 25, 2007 at 18:14 UTC
If you just want to use brute force then using a hash is really cheating as it is already the begining of optimisation. ~~Have another look. That claim just isn't true.~~ ~~`common` still visits every single combination of inputs, so it's brute force. Said nasty embedded loops are there and the hash doesn't affect the complexity.~~ ~~`common` uses a hash to remove dups, something that isn't even necessary. It actually slows things down, if anything.~~ ~~Even with the optimized `common`, said nasty embedded loops are there overall and the using the optimized `common` doesn't affect the overall complexity.~~	[reply] [d/l] [select]
Re^4: True Brute Force Longest Common Sub Sequence by tachyon-II (Chaplain) on Nov 26, 2007 at 00:31 UTC
Even with the optimized common, said nasty embedded loops are there* overall and the using the optimized common doesn't affect the overall complexity.* Actually in the optimised version there are no nested loops and the overall complexity is greatly reduced. The non-optimised version certainly has embedded loops but as you show in the optimised version there is not any need for them because you are using a hash.	[reply]
Re^5: True Brute Force Longest Common Sub Sequence by ikegami (Patriarch) on Nov 26, 2007 at 06:35 UTC
~~I strongly disagree. Not only does the optimized `common` still requires the nested loops to find the LCS, worse/average O(N^2M^2 + M+N + M+N) is still O(N^2M^2).~~	[reply] [d/l]
Re^6: True Brute Force Longest Common Sub Sequence by tachyon-II (Chaplain) on Nov 28, 2007 at 00:06 UTC
Re^7: True Brute Force Longest Common Sub Sequence by ikegami (Patriarch) on Nov 28, 2007 at 01:13 UTC
Re^3: True Brute Force Longest Common Sub Sequence by ikegami (Patriarch) on Nov 28, 2007 at 01:56 UTC
Your analysis doesn't take into account `$seqs[$i] . $char` and `$seq1 eq $seq2`. They are both based on N and/or M. `$seqs[$i] . $char` won't affect the final O() because it's higher up in the loop, but `$seq1 eq $seq2` is in the innermost loop. `for my $seq1 (subseqs($s1)) { for my $seq2 (subseqs($s2)) {` [download] could be written as `my @s1 = subseqs($s1); my @s2 = subseqs($s2); for my $seq1 (@s1) { for my $seq2 (@s2) {` [download] for faster code of the same complexity.	[reply] [d/l] [select]