in reply to Difference Of Two Strings

Untested for speed, but it's functional:
sub leftover { my ($full, $part) = @_; my %count; $count{$_}++ for split //, $full; for (split //, $part) { return undef if --$count{$_} < 0; } return join "", map { $_ x $count{$_} } keys %count; }

-- Randal L. Schwartz, Perl hacker

Replies are listed 'Best First'.
Re: Re: Difference Of Two Strings
by YuckFoo (Abbot) on Nov 03, 2001 at 05:39 UTC
    Winner by a nose! Thanks merlyn.

    Actually it's a lot better. When I saw how quickly this will return undef, I realized my benchmarking sucks. I have been benching only on successful matches.

    On success:

    l_merlyn: 16 wallclock secs (15.60 usr + 0.00 sys = 15.60 CPU) @ 6564.10/s (n=102400)
    leftover: 16 wallclock secs (15.74 usr + 0.00 sys = 15.74 CPU) @ 6505.72/s (n=102400)

    On failure:

    l_merlyn: 1 wallclock secs ( 1.11 usr + 0.00 sys = 1.11 CPU) @ 9225.23/s (n=10240)
    leftover: 45 wallclock secs (45.12 usr + 0.00 sys = 45.12 CPU) @ 226.95/s (n=10240)

    I should have remembered how slow !~ m// can be.
    Definitely the optimization I was looking for!

      You might also try this one for speed, which avoids building the list unnecessarily, and uses arrays instead of hashes to avoid the hashing algorithm for just a single character:
      sub leftover { my ($full, $part) = @_; my @count; $count[ord $1]++ while $full =~ /(\w)/g; while ($part =~ /(\w)/g) { return undef if --$count[ord $1] < 0; } my $return; $count[$_] and $return .= chr($_) x $count[$_] for 0..$#count; $return; }

      -- Randal L. Schwartz, Perl hacker

        Thanks again merlyn. This one (l_merlyn2) was considerably slower than your original. A bit of it was the change to regex instead of split to process the strings. Mostly it was more expensive to build the return string. Since I only allow a..z, I changed for 0..$#count; to for (ord('a')..$#count); These changes (l_merlyn3) made it faster than the original.

        l_merlyn: 16 wallclock secs (15.46 usr + 0.00 sys = 15.46 CPU) @ 6623.54/s (n=102400)
        l_merlyn2: 24 wallclock secs (24.14 usr + 0.00 sys = 24.14 CPU) @ 4241.92/s (n=102400)
        l_merlyn3: 15 wallclock secs (14.48 usr + 0.00 sys = 14.48 CPU) @ 7071.82/s (n=102400)