in reply to Tell whether two strings differ at only one position

This:

use strict; use warnings; # @data contains the (chomped, lower-cased) entries # from a 70K dictionary file (2076 words beginning with 'j') my $count; # Or whatever: this can check algorithm validity foreach my $item ( @data ) { compare( $item, $_ ) and $count++ for @data; } print $count; sub compare { return 0 unless length $_[0] == length $_[1]; my $diff = 0; for ( 0 .. length $_[0] ) { $diff++ if substr( $_[0], $_, 1 ) ne substr( $_[1], $_, 1 ); return 0 if $diff > 1; } return 1; }

is very much faster than your original code.

Surprisingly(?), initial testing seems to show that it is also considerably faster than any of the replies you have hitherto received using bitwise XOR.

I haven't got the time to do any precise benchmarking at the moment: however, since you need the "fastest solution on Earth", I shall leave that up to you

Replies are listed 'Best First'.
Re^2: Tell whether two strings differ at only one position
by rg0now (Chaplain) on Aug 04, 2005 at 21:14 UTC
    I adopted Albannach's nice benchmark (thanks Albannach) and reBenchmarked all the solutions again. But now, for data, I used a random set of nice Hungarian words. Here are the results:
                   Rate    rg0now  ikegami Not_a_Number BrowserUk   blazar     japhy
    rg0now        288/s        --     -66%         -82%      -95%     -96%      -97%
    ikegami       849/s      195%       --         -45%      -87%     -88%      -90%
    Not_a_Number 1557/s      441%      83%           --      -76%     -78%      -83%
    BrowserUk    6382/s     2118%     652%         310%        --     -10%      -28%
    blazar       7084/s     2362%     735%         355%       11%       --      -20%
    japhy        8900/s     2993%     949%         472%       39%      26%        --
    

    It seems that japhy's solution is the clear winner, even if it is degraded to a sub call. I think that your results might be attributed your specific choice of data, but I do not know enough about the intrinsics of Perl string handling to say the decisive words (in fact, I do not know anything about it, so...)

      Both blazar's and japhy's return true when passed 'fred', 'freda' which obviously isn't right.

      When you add the code to detect that error, you get:

      Rate rg0now ikegami Not_a_Number blazar japhy + BrowserUk rg0now 412/s -- -69% -85% -96% -96% + -96% ikegami 1311/s 218% -- -54% -86% -88% + -89% Not_a_Number 2842/s 589% 117% -- -70% -74% + -75% blazar 9628/s 2236% 634% 239% -- -13% + -17% japhy 11086/s 2590% 745% 290% 15% -- + -4% BrowserUk 11580/s 2709% 783% 307% 20% 4% + --

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.

      Playing more than understanding. Using the same benchmarks I got about a 3-5% speed up with a modified version of blazar's which skips lexical assignment by using $_. Tried a few arrangements with length and local and this was the fastest I stumbled on. OS X, v5.8.2.

      sub compare_blazar_mod { ( $_ = $_[0] ^ $_[1] ) =~ tr/\0//d; 2 < length; }