in reply to Optimizing a string processing sub

Do you mean in the same position, or just the same characters? Your examples suggest the former, but your code the latter. Well, for the former, simple enough, this would be my idea:
($word1 ^ $word2) =~ tr/\0//
Do a bitwise XOR of the words. There where they have the same letters, the resulting character will be a "\0". Next, count these occurrences.

As for the latter, even though it looks simpler as a requirement, as an implementation, it is not. So I'll take a completely different approach.

my %bag; my $total = 0; foreach(split '', $word1) { $bag{$_}++; } foreach(split '', $word2) { if($bag{$_}) { $bag{$_}--; $total++; } } return $total;
In English: cut the first word into characters. Put them in a bag. For each character in the second word, pick it out of the bag, at least, if there's at least one of it left in the bag. Count the number of characters taken from the bag.

Replies are listed 'Best First'.
Re: Re: Optimizing a string processing sub
by spurperl (Priest) on Jan 09, 2003 at 06:00 UTC
    I meant the latter, and your solution works well (see my 2nd reply to dragonchild above for some clarifications).
    It also feels "right". The problem is when some char X appears in word2 and then twice in word1... Using a hash is a clever solution to this problem. I've been trying to run your algorithm on some examples, and in my head, and didn't break it yet.
    It also works about 50% faster than my original version.