Re: Optimizing a string processing sub

Not sure if this is faster, but it might be, as it doesn't use nested loops.

sub score {
    my ($word1, $word2) = @_;
    my (%chars1, %chars2);   # letter => number of occurrences, one hash per word

    $chars1{$_}++ for split '', $word1;
    $chars2{$_}++ for split '', $word2;

    # the minimum of the two hashes is the number in common for each letter
    my $sum = 0;
    # skip letters missing from $word2 so no undefined counts get compared
    $sum += ($chars1{$_} < $chars2{$_}
               ? $chars1{$_}
               : $chars2{$_})   for grep { exists $chars2{$_} } keys %chars1;

    return $sum;
}

while (<DATA>) {
    chomp;
    print "$_: " . score(split /\s+/) . " in common\n";
}

__DATA__
perl monk
help temp
frood hoopy
bilbo baggins
jibber jabber

This prints:

perl monk: 0 in common
help temp: 2 in common
frood hoopy: 2 in common
bilbo baggins: 2 in common
jibber jabber: 5 in common

I notice your algorithms give 3 matches for 'bilbo' and 'baggins'. I think this is because both 'b's in bilbo match the single 'b' in baggins, whereas my version counts each letter at most as many times as it appears in both words. I'm not sure which behavior your specification calls for.

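Update: To speed up your score2 sub, consider using index($word2, $a) >= 0 instead of the regex match. Note the >= here: index returns -1 when there is no match, and 0 is a valid position (a match at the very start of the string). Changing this alone made it approximately as fast as your initial score sub for me.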
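
For what it's worth, here is a minimal sketch of the index-based test on its own. I'm not reproducing your score2 here, so contains_char is just a hypothetical helper name I made up for illustration; the only real call is Perl's built-in index, which returns the 0-based position of the first match, or -1 if there is none.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original score2):
# true if $char occurs anywhere in $word.
sub contains_char {
    my ($word, $char) = @_;
    # index gives the 0-based position of the first match, or -1 for no match,
    # so a match at the start of the string (position 0) must still count.
    return index($word, $char) >= 0;
}

print contains_char('baggins', 'b') ? "yes" : "no", "\n";   # 'b' is at position 0
print contains_char('baggins', 'z') ? "yes" : "no", "\n";   # no 'z', index gives -1
```

Testing against 0 with > instead of >= would silently miss any letter that happens to be the first character of the second word, which is easy to overlook in a benchmark.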

blokhead
