in reply to LCS efficiency problem
Pardon my abruptness in advance. There's a lot of ground to cover.
There's an off-by-one error. The code on which you based your implementation assumed @S and @T were indexed 1..m and 1..n respectively.
There's three bugs in $S[$i] =~ m/^$T[$j]$/ig alone.
What you get is $S[$i] =~ m/^\Q$T[$j]\E\z/i. Better yet, make all the strings lowercase and use eq.
$S[$i] ne undef doesn't check if $S[$i] is defined. defined($S[$i]) does. It seems weird that you should have to check for that at all. I'm inclined to believe you have a bug if you need this check.
For that matter, having to check for $S[$i] ne " " and $T[$j] ne " " means something is wrong too. " " isn't a word, so it shouldn't be in @S and @T. That leads us to the next point...
split(/\s+/, $s) is surely better than split(/ /, $s). Even then, it's not very good. I think I'd favour something more like $s =~ /\w+/g which wouldn't include punctuation at the end of words. (Actually, it wouldn't include punctuation at all.)
@prod's not needed. Since you have the start and the end of the substring, you can use a slice (@S[$start..$end]).
It seems that you forgot to clean out @ret = ();. How come you're not using use strict;???
It seems that you forgot to use my on $num and @L. How come you're not using use strict;???
You use the same variable ($n) for two different things. That's a bad idea.
This one is mostly stylistic, but
my $num = scalar @start; for (my $n = 0; $n < $num;$n++)
is much more readable and maintainable when written as
for my $n (0 .. $#start)
That's a really weird return value. But given your need for speed, it might actually make sense. It *might* help to return a reference to @LCS instead of @LCS itself.
Those are bugs to fix, not performance enhancements (except the tip to make everything lowercase). I didn't even try running the program,
Update: Changed @s[$start .. $end-$start+1] to @S[$start..$end]
Update: Added off-by-one error.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^2: LCS efficiency problem
by BrowserUk (Patriarch) on Jun 05, 2008 at 06:44 UTC | |
by ikegami (Patriarch) on Jun 05, 2008 at 07:46 UTC | |
|
Re^2: LCS efficiency problem
by lima1 (Curate) on Jun 06, 2008 at 07:13 UTC | |
by ikegami (Patriarch) on Jun 06, 2008 at 18:31 UTC |