Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?

Re: Longest Common Substring

by BrowserUk (Patriarch)
on Apr 09, 2003 at 16:24 UTC ( #249286=note: print w/replies, xml ) Need Help??

in reply to Longest Common Substring

Probably not the most efficient implementation (and maybe incomplete), but if you can't get the modules to work, you could wrap this up as a subroutine.

#! perl -slw use strict; use Data::Dumper; my $needle = 'the lazy cat'; my $haystack = 'the quick brown fox jumps over the lazy dog'; my @matches; for my $start (0..length $needle) { for my $len ($start+1 .. length $needle) { my $substr = substr($needle, $start, $len); push @matches, $haystack =~ m[($substr)]g; } } my ($len, $longest) = 0; length > $len and ($longest, $len) = ($_, length) for @matches; print "The longest common substring between\n", $needle, "\nand\n", $h +aystack, "\nis '$longest'"; __END__ C:\test> The longest common substring between the lazy cat and the quick brown fox jumps over the lazy dog is 'the lazy ' C:\test>

Examine what is said, not who speaks.
1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
3) Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke.

Replies are listed 'Best First'.
Re: Re: Longest Common Substring
by Enlil (Parson) on Apr 09, 2003 at 19:15 UTC
    BrowserUK's solution doesn't quite work I tried the following and did not get back what was expected ( " the quick brown fox "):
    my $needle = 'i pushed the lazy dog into a creek, the quick brown fox +told me to'; my $haystack = 'the quick brown fox jumps over the lazy dog';
    and got back 'the lazy dog' (it works if you reverse the $needle and the $haystack.

    But to his credit it seems to work better than String-LCSS-0.10, and he did mention that it might be incomplete (run it both ways, get the longest string and print that, I believe should give you the most right answer, but my tests might also be incomplete).

    Where which running the following code:

    I would use Algorithm-Diff using this part of the documentation as a starting point.

    Note: The Author of String-LCSS-0.10 has been notified.


Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://249286]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (5)
As of 2023-02-08 06:31 GMT
Find Nodes?
    Voting Booth?
    I prefer not to run the latest version of Perl because:

    Results (40 votes). Check out past polls.