
Ah, so we have to examine the powerset of substrings and not just the stem. Ok, that's just a change to the loop. I think my point still stands, though.

My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Re^4: Finding a LCS module on word level
by Limbic~Region (Chancellor) on May 10, 2008 at 00:41 UTC
    dragonchild,
    I didn't review your logic other than to see you are using index. I am fairly certain that is a flawed approach since the OP is looking for the longest common substring based on words not characters.
    Have you ever seen the rain
    I have never seen the rain
    # should produce "seen the rain" not "ever seen the rain" because "ever" ne "never"

    Cheers - L~R
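
A minimal sketch of the difference between a character-level and a word-level match. The helper name `contains_words` is made up for illustration and does not come from any module; it assumes words are simply whitespace-separated tokens.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $phrase occurs in $text as a run of whole words.
sub contains_words {
    my ($text, $phrase) = @_;
    # Collapse whitespace, then pad with spaces so a hit must align to word edges.
    my $padded = ' ' . join(' ', split ' ', $text) . ' ';
    my $needle = ' ' . join(' ', split ' ', $phrase) . ' ';
    return index($padded, $needle) >= 0 ? 1 : 0;
}

my $second = 'I have never seen the rain';

# Character-level test: "ever seen the rain" is found inside "never seen the rain".
my $char_hit = index($second, 'ever seen the rain') >= 0 ? 1 : 0;

# Word-level test: "ever" is not a word of $second, so this fails.
my $word_hit = contains_words($second, 'ever seen the rain');

print "char-level: $char_hit, word-level: $word_hit\n";
```

The plain index call reports a hit while the padded comparison does not, which is the mismatch described above.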

      Good point. Leaving aside issues of "What constitutes a word", this process seems like it would boil down to:
      sub lcs {
          my ($string, $find) = @_;
          # Normalize whitespace; pad with spaces so hits fall on word edges.
          $string = ' ' . join(' ', split ' ', $string) . ' ';
          my @f = split ' ', $find;
          my ($max_length, $substr) = (0, '');
          foreach my $start ( 0 .. $#f ) {
              foreach my $end ( $start .. $#f ) {
                  my $len = $end - $start + 1;
                  next if $len <= $max_length;    # cannot beat the current best
                  my $search = join ' ', @f[ $start .. $end ];
                  # index() sidesteps regex metacharacters in the words.
                  last if index($string, " $search ") < 0;
                  ($max_length, $substr) = ($len, $search);
              }
          }
          return $substr;
      }
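
For comparison, the same idea can be sketched with the classic dynamic-programming longest common substring, run over arrays of words instead of characters. This is a different technique from the nested-loop search above, not the code from this thread. The name `lcs_words` is hypothetical, and the sketch assumes whitespace-separated, case-sensitive words.

```perl
use strict;
use warnings;

# Hypothetical name: longest common run of whole words shared by $s and $t.
sub lcs_words {
    my ($s, $t) = @_;
    my @a = split ' ', $s;
    my @b = split ' ', $t;
    my ($best_len, $best_end) = (0, 0);
    my @prev = (0) x (@b + 1);    # run lengths ending at the previous word of @a
    for my $i (1 .. @a) {
        my @cur = (0) x (@b + 1);
        for my $j (1 .. @b) {
            next unless $a[$i - 1] eq $b[$j - 1];
            # Extend the run that ended one word earlier in both inputs.
            $cur[$j] = $prev[$j - 1] + 1;
            if ($cur[$j] > $best_len) {
                ($best_len, $best_end) = ($cur[$j], $i);
            }
        }
        @prev = @cur;
    }
    return '' unless $best_len;
    return join ' ', @a[ $best_end - $best_len .. $best_end - 1 ];
}

# Prints "seen the rain" for the example from this thread.
print lcs_words('Have you ever seen the rain', 'I have never seen the rain'), "\n";
```

Because the comparison is done word by word with eq, "Have" and "have" count as different words here. Lowercase both inputs first if that is not what you want.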