in reply to Longest Common Subsequence
This allows for any number of strings with only minor tweaking, but it does assume the conditions given in your original problem: every character is present once and only once in each string.

    use strict;
    use warnings;

    my (@first, @pos, %chain, $c, $max, $mstring);

    # The first string is the base; its characters drive the search.
    chomp($_ = <DATA>);
    @first = split //, $_;

    # For every other string, map each character to its position.
    while (<DATA>) {
        chomp;
        my %hash;
        @_ = split //, $_;
        $hash{$_[$_]} = $_ for 0..$#_;
        push @pos, \%hash;
    }

    # $chain{$c}[$i] is the position of $c in the $i-th remaining string.
    for $c (@first) {
        for (@pos) {
            push @{$chain{$c}}, $_->{$c};
        }
    }

    $max = 0;
    $mstring = '';
    # One start offset for the base string plus one per remaining string.
    find('', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
    print $mstring;

    sub find {
        my ($s, @p) = @_;
        my ($end, $cp, $c) = 1;
        CHAR: for $cp ($p[0]..35) {        # 36 characters per string
            my @np = ($cp+1);
            $c = $first[$cp];
            for (0..8) {                   # the 9 strings after the base
                # Skip $c if it lies before the current position in any string.
                next CHAR if $chain{$c}[$_] < $p[$_+1];
                $np[$_+1] = $chain{$c}[$_];
            }
            find($s.$c, @np);
            $end = 0;
        }
        # Nothing could be appended, so $s is maximal; keep the longest.
        if ($end && length($s) > $max) {
            $max = length($s);
            $mstring = $s;
        }
    }

    __DATA__
    CPD6Z98SB2KQNWV0F7Y1IX4GLRA5MTOJHE3U
    CXZOL6SUI2WTJ30HF519YPGBRNAK48MQVD7E
    T8COSQU6I2FJN40DKL157WVGPYXARZ3MBHE9
    KNCWVZDSR5420LP91FIQGB7Y3A6J8MOUXTEH
    XF9C4PSDY62TWJ0QBN17IKG3OH8ALVRM5UEZ
    D9QCHUSN7TW2YZL0O831FGXIR6JA4P5MVBKE
    ZC7ISQUPK6N20OLV4T31G9FRXBAWM5YJHED8
    Z3C7SJVODL25TRQ01HPWGNKXB4UA68YMI9EF
    BC9OXDHS2FI5Z6U0TYL1VPGQK7ANR38MEWJ4
    K4TCQBHS2ZV7FXU0P8R1YGDON3A6JILM9EW5

Variable-length strings won't be difficult once the other conditions are met, but repeated chars will add a bit of complexity, and absent chars are a major pain if you're working with multiple strings. For instance, what if you have four strings:
BCDE
ACDE
ABDE
ABCD
Looking at this, you might come to the conclusion that the sequence is supposed to be ABCDE. After all, each string is only missing one character, right? But how do you compute that automatically? If you base from string 1, you can pass over the missing B in string 2, the missing C in string 3, and the missing E in string 4. But you don't notice the missing A in string 1. You get the same problem basing from any other string - one character is always lost. I still can't figure out how to work around this problem, and if I can't figure it out on paper, I certainly can't program a solution.
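To make the blind spot concrete, here is a small sketch (not a solution) that reports, for each choice of base string, which character of the combined alphabet that base can never contribute. The helper names `alphabet` and `missing_from` are made up for this illustration and are not part of the code above.

```perl
use strict;
use warnings;

# The four example strings from the post.
my @strings = qw(BCDE ACDE ABDE ABCD);

# Hypothetical helper: sorted list of characters seen in at least one string.
sub alphabet {
    my %seen;
    for my $s (@_) {
        $seen{$_} = 1 for split //, $s;
    }
    return sort keys %seen;
}

# Hypothetical helper: characters of the alphabet that $base does not contain.
sub missing_from {
    my ($base, @alphabet) = @_;
    return grep { index($base, $_) < 0 } @alphabet;
}

my @alphabet = alphabet(@strings);
for my $i (0 .. $#strings) {
    my @missing = missing_from($strings[$i], @alphabet);
    printf "base = string %d (%s): never sees %s\n",
        $i + 1, $strings[$i], join(',', @missing);
}
```

With the four strings above this reports A, B, C and E as the character lost for bases 1 through 4 respectively, which is the "one character is always lost" problem: any search that only walks the characters of a single base string cannot produce a character that base lacks.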