in reply to The indisputable speed of tr///

I agree with you in that the tr operator gives very good speed, so you should probably use it in this case if this takes significanf runtime in your script.

However, let me investigate how much speed I can gain without using the tr operator. I will show that the rot13 solution you show isn't nearly optimal even in that case. (I'll assume that you're translating byte strings.)

#!/usr/bin/env perl use warnings; use strict; use Benchmark qw':all'; my %rot = ( 'A' => 'N','B' => 'O','C' => 'P','D' => 'Q','E' => 'R','F' => 'S', 'G' => 'T','H' => 'U','I' => 'V','J' => 'W','K' => 'X','L' => 'Y', 'M' => 'Z','N' => 'A','O' => 'B','P' => 'C','Q' => 'D','R' => 'E', 'S' => 'F','T' => 'G','U' => 'H','V' => 'I','W' => 'J','X' => 'K', 'Y' => 'L','Z' => 'M','a' => 'n','b' => 'o','c' => 'p','d' => 'q', 'e' => 'r','f' => 's','g' => 't','h' => 'u','i' => 'v','j' => 'w', 'k' => 'x','l' => 'y','m' => 'z','n' => 'a','o' => 'b','p' => 'c', 'q' => 'd','r' => 'e','s' => 'f','t' => 'g','u' => 'h','v' => 'i', 'w' => 'j','x' => 'k','y' => 'l','z' => 'm', ); our(%sub, @subname); sub rot13sub { my($name, $code) = @_; push @subname, $name; $sub{$name} = $code; }
# The original subroutine rot13sub "orig", sub { # as close as possible to source project's code join '', map { exists $rot{$_} ? $rot{$_} : $_ } split('',shift); }; # Firstly, let's fill in the defaults in the hash table so that we # can use a hash slice my %rotd = %rot; exists($rotd{$_}) or $rotd{$_} = $_ for map chr, 0 .. 255; rot13sub "hashslice-s", sub { join '', @rotd{split('',shift)}; };
# Two variants instead of the split rot13sub "hashslice-m", sub { my ($x) = @_; my @t = $x =~ /./gs; join '', @rotd{@t}; }; rot13sub "hashslice-u", sub { my ($x) = @_; join '', @rotd{unpack "(a)*", $x}; };
# Now let's use an array slice instead of a hash my @rot = do { my $c; map { $c = chr($_); exists $rot{$c} ? ord($rot{$c}) : $_ } 0 .. 255; }; rot13sub "arrayslice", sub { pack "C*", @rot[unpack("C*", $_[0])]; }; # Now let's try some solutions with s, as demerphq has suggested. # The first one has to know the exact character range to be replaced. rot13sub "gsub-class", sub { (my $s = $_[0]) =~ s/([a-zA-Z])/$rot{$1}/g; $s; }; rot13sub "gsub-dot", sub { (my $s = $_[0]) =~ s/(.)/$rotd{$1}/gs; $s; }; # Now your tr equivalent rot13sub "tr", sub { (my $s = shift) =~ tr/a-zA-Z/n-za-mN-ZA-M/; $s; };
# First, check correctness. This is important. our $data = join "", <DATA>; our $correct = &{$sub{"orig"}}($data); print $correct; { for my $name (@subname) { my $result = &{$sub{$name}}($data); $correct eq $result or die "error: subroutine $name produces incorrect result"; } warn "all ok"; } # Next, compare them. { cmpthese(-5, {map { my($name, $sub) = ($_, $sub{$_}); $name, sub { my $r = &$sub($data); } } @subname}); } __DATA__ Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Vivamus luct +us nulla sed tellus. Sed vitae sapien in elit vestibulum faucibus. Maecenas sol +licitudin, magna quis vestibulum convallis, ligula nulla fringilla augue, nec ru +trum dolor quam vel justo. Vivamus nisi odio, ullamcorper sed, venenatis id, imp +erdiet eleifend, neque. Vivamus justo felis, euismod in, convallis auctor, eg +estas id, purus. In rutrum nisi in lectus. Aliquam elementum placerat dui. Integ +er ut pede sit amet magna pulvinar placerat. Maecenas massa lorem, lobortis ut, a +dipiscing at, suscipit nec, lectus. Curabitur mattis adipiscing sem. Mauris phar +etra vehicula eros. Sed pellentesque elit laoreet augue. Cras non tellus. Q +uisque volutpat lectus in sem. Fusce vulputate justo ut pede. Aliquam ante pe +de, tempor in, dictum in, blandit eu, sapien. Morbi vestibulum, metus eu auctor v +ulputate, nulla lectus condimentum nisi, ac pulvinar ligula nunc in felis. Curab +itur id orci ac est luctus molestie.

The results on my home computer is:

Rate orig hashslice-m gsub-dot hashslice-s gsub-clas +s hashslice-u arrayslice tr orig 962/s -- -20% -30% -40% -43 +% -58% -82% -99% hashslice-m 1206/s 25% -- -12% -25% -29 +% -47% -78% -99% gsub-dot 1372/s 43% 14% -- -15% -19 +% -40% -75% -99% hashslice-s 1605/s 67% 33% 17% -- -5 +% -30% -70% -99% gsub-class 1687/s 75% 40% 23% 5% - +- -26% -69% -99% hashslice-u 2278/s 137% 89% 66% 42% 35 +% -- -58% -98% arrayslice 5414/s 463% 349% 295% 237% 221 +% 138% -- -96% tr 148174/s 15308% 12184% 10701% 9135% 8681 +% 6403% 2637% --
Which shows that the fastest method I could find that doesn't use tr is still much slower than the tr one, but only 26 times slower, not 150 times slower as the original code you've shown is.

For the way I test here, see also my Benchmarking perfect shuffles. For another case when I use a lookup table instead of transliterations because it's not easy to calculate the correct arguments for tr, see Re (Obfu generator for): Ode for getprotobyname.

Update: added two solutions using s as demerphq has suggested.

Replies are listed 'Best First'.
Re^2: The indisputable speed of tr///
by radiantmatrix (Parson) on Jun 27, 2006 at 14:23 UTC

    Hey, neat!

    I know that the solution using split is tremendously sub-optimal, but that's almost exactly what the code I was given did (just in two steps with an array assignment in between). Thanks for the work on so many ways to do it -- I'm pointing my boss toward your post as further evidence that we now have the optimal solution for that particular chunk of code.

    <radiant.matrix>
    A collection of thoughts and links from the minds of geeks
    The Code that can be seen is not the true Code
    I haven't found a problem yet that can't be solved by a well-placed trebuchet