in reply to Some of the above suggestions, benchmarked (Re: Similarity of strings)
in thread Similarity of strings


I'm glad that someone benchmarked this.

However, you were a little unfair to the chop method. :-) The scalar reverse and the array assignments aren't necessary. The following version is 5 times faster, although still 5 times slower than the xor method:

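sub chop2 {
    # $string1 and $string2 are expected to be set by the surrounding benchmark script
    my $str1   = $string1;
    my $str2   = $string2;
    my $length = length $string1;
    my $score  = 0;
    # Compare the strings from the end, one character at a time.
    # Testing the length (not the string itself) keeps a leftover "0" from ending the loop early.
    $score += (chop($str1) eq chop($str2)) while length $str1;
    return $score / $length;
}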

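Update: Albannach points out that the reverse is required after all, because the strings in this test are not of equal length: chop works from the end of each string, so without the reverse the characters are compared aligned at the end rather than at the start. My code was based on the original sample data.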
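To illustrate Albannach's point, here is a minimal sketch. The helper name chop_score, its $reverse flag, and the sample strings are made up for this example and are not part of the benchmark. It scores the same pair of unequal-length strings with and without the reverse:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the benchmark): fraction of positions of
# $str1 that match $str2, comparing one character at a time with chop.
sub chop_score {
    my ($str1, $str2, $reverse) = @_;
    my $length = length $str1;
    return 0 unless $length;

    if ($reverse) {
        # Scalar assignment puts reverse in scalar context, so it reverses
        # the characters of the string.
        $str1 = reverse $str1;
        $str2 = reverse $str2;
    }

    my $score = 0;
    # chop removes and returns the LAST character, so the comparison
    # always runs from the end of both strings.
    while (length $str1 and length $str2) {
        $score++ if chop($str1) eq chop($str2);
    }
    return $score / $length;
}

print chop_score('abcd', 'abcdef', 0), "\n";   # end-aligned: d/f, c/e, b/d, a/c
print chop_score('abcd', 'abcdef', 1), "\n";   # start-aligned: a/a, b/b, c/c, d/d
```

Without the reverse, none of the pairs match, even though the first string is a prefix of the second. With it, all four do.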

It is also worth adding that the speed of the xor method depends less on the string length than that of the other methods.
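For reference, this is roughly what the xor approach looks like. It is a sketch under my own assumptions, not the benchmarked code: the name xor_score is invented here, and it assumes plain byte strings (no characters above 255) and that the bitwise feature is not enabled, so ^ works on strings. The whole comparison is done by two built-in operations rather than a Perl-level loop over the characters, which is probably why it is less sensitive to string length.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the benchmark): fraction of positions of
# $str1 at which $str2 holds the same byte.
sub xor_score {
    my ($str1, $str2) = @_;
    my $length = length $str1;
    return 0 unless $length;

    # String XOR gives "\0" wherever the two bytes are identical.
    # The shorter string is treated as if padded with "\0" bytes.
    my $xor = $str1 ^ $str2;

    # tr with an empty replacement list only counts; it does not modify $xor.
    my $matches = ($xor =~ tr/\0//);

    return $matches / $length;
}

print xor_score('abcd', 'abxd'), "\n";     # 3 of 4 positions match
print xor_score('abcd', 'abcdef'), "\n";   # all 4 positions of the first string match
```

Note that the comparison is aligned at the start of the strings, so no reverse is needed.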

--
John.
