in reply to LCS efficiency problem

I've seen some very impressive Perl code in the answers so far. I agree that the best answer to the OP is use a better algorithm.

I'm not up to coding a better answer. But I suspect that the answers so far could be speeded up further by using the 'jump tables' trick in the Boyer Moore String Search algorithm.

Replies are listed 'Best First'.
Re^2: LCS efficiency problem
by zhe (Initiate) on Jun 10, 2008 at 06:58 UTC
    Hi, all, Thanks very much for all your replies.Learned a lot from you guys.
    I have two files,each of them is sentences.For example file1 has sentences:
    A C D E G F D
    C B F D E A
    A A C F D B A
    file2 has sentences:
    A C D
    F D B
    F D D
    the result would be for each of the sentence in file 1 matching each of the sentence in file 2, return the longest common substring and their index on a word level between the two sentences.

    RESULT:
    for sentence 1 in file 1
    LCS :A C D0,3 F D5,6 F D5,6
    Original Sentence:A C D E G F D

    sentence 2 in file 1
    LCS :A5,5 C0,0 D3,3 F D3,4 F D 3,4
    Original Sentence:C B F D E A

    sentence 3 in file 1
    LCS :A C1,2 F D B3,5 F D3,4
    Original Sentence:A A C F D B A

    So is there any way to make this process more efficiency?