comment on

Hello monks,

I have been searching far and wide for a true brute force code or algorithm to find the longest common subsequence between two strings for my research project and have not been so lucky. I have seen a few code here in Perlmonks but they are not truely brute force so it will make my measurements inaccurate, so I am turning to Perl monks for some wisdoms. I read an article on the web and it said that for two strings S1 and S2, assuming S1 is the longer string, I need to enumerate all possible subsequences in S1. Let's assume S1="abcde" and S2="abe" then to enumerate all possible subsequences of S1, what do I need to do? My first guess would be to use the split function to split S1 into individual character but then unsure of what to do next. Any helps would be appreciated.
Thanks in advance.

Also, I found the following algorithm on the web and it claims to be brute force but I am having a hard time understand what it actually do to implement it in Perl so would someone please explain it.

lcs is longest common subsequence algorithm
lcs algorithm in c
brute-force approach
/ input: array a of x/
/ input: array b of x/
/ output: pointer to beginning of longest sequence/
x * c[];
int a_counter;
int b_counter;
int c_counter;
exists bool;
int longest_sequence;
int longest_sequence_index;
/ generate array of unique elements from a called c/
for (a_counter = a_start; a_counter < size_of_a; a_counter++) {

        c_counter = c_start;
        while ( c_counter < size_of_c ) {
                if (c[c_counter] == a[a_counter] ) {
                        exists = true;
                        break;
                }
                c_counter++;
        }
        if (exists) continue;
        else c[a_counter++] = a[a_counter];
        exists = false;

}
/ generate max_c array to hold max entries of c array/
/* (could also implement d as linked list)*/
int max_c[size_of_c];
for ( c_counter = c_start; c_counter < size_of_c; c_counter++) {

max_c[c_counter] = 0;
}
/ compare a[i] with ending elements of each c[0 ... size_of_c].final/
/ elements/
for (b_counter = b_start+1; b_counter <= size_of_b; b_counter++) {

        for (c_counter = c_start; c_counter < size_of_c; c_counter++) 
+{
                if (c[c_counter][max_c[c_counter]] == b[b_counter-1]) 
+{
                        c[c_counter][max_c[c_counter]] = b[b_counter];
                        max_c[c_counter]++;
                }
        }

}
/* find longest sequence */
int longest_sequence;
for (c_counter = c_start; c_counter < size_of_c; c_counter++) {

        if (longest_sequence < max_c[c_counter]) {
                longest_sequence = max_c[c_counter];    
                longest_sequence_index = c_counter;     
        }

}
return c[longest_sequence_index];
[download]

In reply to True Brute Force Longest Common Sub Sequence by Anonymous Monk

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.