in reply to Re: Re: Re: Optimizing a string processing sub
in thread Optimizing a string processing sub
I put some more thought into the exact trouble I have with dragonchild's second solution. It works in all cases where each unique character in word1 shows up fewer times than, or the same number of times as, it does in word2 (or vice versa). The reason is simple. It calculates the number of characters in word1 that are in word2, then the number of characters in word2 that are in word1, and returns the lower of the two values. It does not consider the possibility that some characters will show up fewer times in word1 while other characters show up fewer times in word2. (Example: "aabccc" and "abbbbc" yield 6 with this solution, which is very unexpected.)
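To make the failure mode concrete, here is a minimal sketch that reproduces the behavior as described above. It is a reconstruction from that description only, not dragonchild's actual code, and the names one_way_count and min_of_both_directions are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical reconstruction of the described behavior, NOT the code from
# the thread: count how many characters of $from occur anywhere in $in.
sub one_way_count {
    my ($from, $in) = @_;
    my $count = grep { index($in, $_) >= 0 } split //, $from;
    return $count;
}

# Take the lower of the two one-way counts, as the description says.
sub min_of_both_directions {
    my ($word1, $word2) = @_;
    my $forward  = one_way_count($word1, $word2);
    my $backward = one_way_count($word2, $word1);
    return $forward < $backward ? $forward : $backward;
}

print min_of_both_directions("aabccc", "abbbbc"), "\n";   # prints 6
```

Every one of the six characters in each word appears somewhere in the other word, so both one-way counts are 6 and so is the minimum, even though 'a' and 'c' are scarcer in word2 while 'b' is scarcer in word1.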
Rather than merely complaining, I will offer a solution:
    sub score {
        my($word1, $word2) = @_;
        return $words_are_equal_score if $word1 eq $word2;
        my $score = 0;
        for ($word1 =~ /(.)/g) {
            $score++ if $word2 =~ s/\Q$_\E//;
        }
        $score;
    }
For the border case (the example) described above, this solution scores "aabccc" and "abbbbc" with the value 3.
The trick is that the "$word2 =~ s/\Q$_\E//" ensures that characters in $word2 will not be considered twice, because they are removed as they are counted. I will continue to think of a method of accelerating this function. I wanted to get one fast, correct answer in this thread out first. :-)
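One possible direction for speeding this up, offered only as a sketch and not benchmarked: tally the characters of the second word in a hash once, then let each character of the first word consume one matching entry. The name shared_char_count is hypothetical. The sketch leaves out the $words_are_equal_score shortcut, and because it uses split it also counts newlines, which /(.)/g would skip.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: counts the characters the two
# words share, with multiplicity, without modifying either string.
sub shared_char_count {
    my ($word1, $word2) = @_;

    # Tally how often each character occurs in $word2.
    my %available;
    $available{$_}++ for split //, $word2;

    # Each character of $word1 uses up one matching character, if any is left.
    my $score = 0;
    for my $char (split //, $word1) {
        next unless $available{$char};
        $available{$char}--;
        $score++;
    }
    return $score;
}

print shared_char_count("aabccc", "abbbbc"), "\n";   # prints 3
```

Decrementing the tally plays the same role as the substitution: a character in the second word can only be matched once. The difference is that each lookup is a hash access rather than a regex scan over the remaining string.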
Re: Re: Re: Re: Re: Optimizing a string processing sub
by sauoq (Abbot) on Jan 09, 2003 at 08:31 UTC
Re5: Optimizing a string processing sub
by dragonchild (Archbishop) on Jan 09, 2003 at 15:11 UTC