To find the longest substring shared by two strings.
sub longest_common_substr { # provided you know there are no NULs my $str = join "\0", @_; my $len = 1; my $match; while ($str =~ m{ ([^\0]{$len,}) (?= [^\0]* \0 [^\0]*? \1 ) }xg) { $len = length($match = $1) + 1; } return $match; }

Replies are listed 'Best First'.
Re: Longest common substring
by blakem (Monsignor) on Feb 16, 2002 at 00:29 UTC
    At the risk of getting rebuked again while commenting on a japhy regex ;-P

    Overlapping matches seem to cause a problem... For example, the last four characters in 'abcabc' and 'caWcabc' match, yet the function only returns the last three.

    print longest_common_substr('abcabc','caWcabc'); # 'abc' not 'cabc'
    I think it might be as simple as removing the /g (but thats the part I don't fully comprehend....) The code forces each match to be bigger than the previous one, with greedy matching helping us rachet up several steps at a time. The /g might be doing something else tricky, but I don't see it.

    Update: Other word pairs that fail similarly are sense/tense and onion/union.

    -Blake