in reply to Re: Difference Of Two Strings
in thread Difference Of Two Strings

Winner by a nose! Thanks merlyn.

Actually it's a lot better. When I saw how quickly this will return undef, I realized my benchmarking sucks. I have been benching only on successful matches.

On success:

l_merlyn: 16 wallclock secs (15.60 usr + 0.00 sys = 15.60 CPU) @ 6564.10/s (n=102400)
leftover: 16 wallclock secs (15.74 usr + 0.00 sys = 15.74 CPU) @ 6505.72/s (n=102400)

On failure:

l_merlyn: 1 wallclock secs ( 1.11 usr + 0.00 sys = 1.11 CPU) @ 9225.23/s (n=10240)
leftover: 45 wallclock secs (45.12 usr + 0.00 sys = 45.12 CPU) @ 226.95/s (n=10240)

I should have remembered how slow !~ m// can be.
Definitely the optimization I was looking for!

Replies are listed 'Best First'.
Re: Re: Re: Difference Of Two Strings
by merlyn (Sage) on Nov 03, 2001 at 22:58 UTC
    You might also try this one for speed, which avoids building the list unnecessarily, and uses arrays instead of hashes to avoid the hashing algorithm for just a single character:
    sub leftover { my ($full, $part) = @_; my @count; $count[ord $1]++ while $full =~ /(\w)/g; while ($part =~ /(\w)/g) { return undef if --$count[ord $1] < 0; } my $return; $count[$_] and $return .= chr($_) x $count[$_] for 0..$#count; $return; }

    -- Randal L. Schwartz, Perl hacker

      Thanks again merlyn. This one (l_merlyn2) was considerably slower than your original. A bit of it was the change to regex instead of split to process the strings. Mostly it was more expensive to build the return string. Since I only allow a..z, I changed for 0..$#count; to for (ord('a')..$#count); These changes (l_merlyn3) made it faster than the original.

      l_merlyn: 16 wallclock secs (15.46 usr + 0.00 sys = 15.46 CPU) @ 6623.54/s (n=102400)
      l_merlyn2: 24 wallclock secs (24.14 usr + 0.00 sys = 24.14 CPU) @ 4241.92/s (n=102400)
      l_merlyn3: 15 wallclock secs (14.48 usr + 0.00 sys = 14.48 CPU) @ 7071.82/s (n=102400)