in reply to finding longest common substring
The code below is broken! Please see Re: Re: Re: finding longest common substring (ALL common substrings) for details, and the update at the bottom for a couple of 'fixed' versions.
This will never win the "fastest longest common substring" accolade, but it is interesting in that in a list context, it returns a list of all common substring sorted by length (longest first).
I was also surprised how simple it was to code, and fairly surprised by how efficient it was given what it does.
sub lcs{ our %subs = (); my $n = @_; shift =~ m[^.*(.+)(?{ $subs{ $^N }++ })(?!)] while @_; my @subs = sort{ length $b <=> length $a } grep{ $subs{ $_ } == $n } keys %subs; return wantarray ? @subs : $subs[ 0 ]; }
Update: The following two versions work, in as much as they will return the longest common substring if called in a scalar context. They will also return all common substrings (ordered longest to shortest) when called in a list context. As lcs routines, they are both slow, with lcs3() being marginally quicker than lcs2(). I'm not sure how they compare performance-wise with other mechanism for generating all common substrings.
As implemented, they also do not preserve the value of two (unavoidable?) globals %subs & $n. This could be fixed by judicious use of local if it is of concern.
sub lcs2{ our %subs = (); my $selector = ''; for our $n ( 0 .. $#_ ) { vec( $selector, $n, 1 ) = 1; $_[ $n ] =~ m[^.*?(.+?)(?{ $subs{ $^N } = '' unless exists $subs{ $^N }; vec( $subs{ $^N }, $n, 1 ) = 1 })(?!)]; } return wantarray ? sort{ length $b <=> length $a } grep{ $subs{ $_ } eq $selector } keys %subs : reduce{ length $a > length $b ? $a : $b } grep{ $subs{ $_ } eq $selector } keys %subs; } sub lcs3{ our %subs = (); my $selector = ' ' x @_; for our $n ( 0 .. $#_ ) { substr( $selector, $n, 1, '1' ); $_[ $n ] =~ m[^.*?(.+?)(?{ $subs{ $^N } = ' ' x @_ unless exists $subs{ $^N }; substr( $subs{ $^N }, $n, 1, '1' ); })(?!)]; } return wantarray ? sort{ length $b <=> length $a } grep{ $subs{ $_ } eq $selector } keys %subs : reduce{ length $a > length $b ? $a : $b } grep{ $subs{ $_ } eq $selector } keys %subs; }
Whether the above code has any merits I'm not sure, but it's here should anyone find a good use for it.
|
---|
Replies are listed 'Best First'. | |
---|---|
Re: Re: finding longest common substring (ALL common substrings)
by revdiablo (Prior) on Nov 20, 2003 at 17:55 UTC | |
by BrowserUk (Patriarch) on Nov 20, 2003 at 22:29 UTC | |
Re: Re: finding longest common substring (ALL common substrings)
by welchavw (Pilgrim) on Nov 20, 2003 at 20:43 UTC | |
by BrowserUk (Patriarch) on Nov 20, 2003 at 22:31 UTC |