in reply to Re^2: Substring Distance Problem
in thread Measuring Substrings Problem
If I may tinker with Roy's code, here is the fix you need. I give the modification for both the original and the xor versions of the algorithm:
use strict; sub score { my ($str, $array) = @_; my $vec = ''; for (@$array) { my $ofs = 0; while ( ( my $idx = index $str, $_, $ofs ) > -1 ) { # Set bits at each matched location vec($vec, $_, 1)= 1 for $idx..$idx+length($_)-1; $ofs = $idx + 1; } } # Count set bits unpack '%32b*', $vec; } sub score_xor { my ($str, $array) = @_; my $vec = "\0" x length($str); for (@$array) { my $ofs = 0; while ( ( my $idx = index $str, $_, $ofs ) > -1 ) { # Matching substrings are padded into position with nulls $vec |= ("\0" x $idx) . $_; $ofs = $idx + 1; } } # Matching characters become nulls; others non-nulls $vec ^= $str; # Count nulls $vec =~ tr/\0//; }
I was curious to see how these two versions compared, and was surprised to learn that the original one is faster by well over a factor of 2:
Rate xor v1 xor 112734/s -- -59% v1 275692/s 145% --
the lowliest monk
|
|---|