Recently, I was assigned to optimize some code that substituted certain characters for certain other characters. In other words, the application for which tr/// was created. This used to run a couple dozen times a day, but now that it runs much more, the performance has become unacceptable.

Unfortunately, the previous coder had decided that a hash which mapped one char to another would be a better choice than the transliteration. Above the definition of said hash, s*he makes the comment:

# pre-define relationships for runtime efficiency

I mentioned to my employer that this could probably be re-written with the appropriate transliteration, but it would take me some time to figure out reasonable character classes. The comment, however, led my employer to believe that I might be sacrificing efficiency for "correctness". Therefore, I decided to demonstrate the value of the tr///-based solution using a problem with well-defined char classes -- ROT-13:

#!/usr/bin/perl use strict; use warnings; use Benchmark qw':all'; my %rot = ( 'A' => 'N','B' => 'O','C' => 'P','D' => 'Q','E' => 'R','F' => 'S', 'G' => 'T','H' => 'U','I' => 'V','J' => 'W','K' => 'X','L' => 'Y', 'M' => 'Z','N' => 'A','O' => 'B','P' => 'C','Q' => 'D','R' => 'E', 'S' => 'F','T' => 'G','U' => 'H','V' => 'I','W' => 'J','X' => 'K', 'Y' => 'L','Z' => 'M','a' => 'n','b' => 'o','c' => 'p','d' => 'q', 'e' => 'r','f' => 's','g' => 't','h' => 'u','i' => 'v','j' => 'w', 'k' => 'x','l' => 'y','m' => 'z','n' => 'a','o' => 'b','p' => 'c', 'q' => 'd','r' => 'e','s' => 'f','t' => 'g','u' => 'h','v' => 'i', 'w' => 'j','x' => 'k','y' => 'l','z' => 'm', ); sub rot13 ($) { # as close as possible to source project's code join '', map { exists $rot{$_} ? $rot{$_} : $_ } split('',shift); } sub rot13tr ($) { (my $s = shift) =~ tr/a-zA-Z/n-za-mN-ZA-M/; $s; } our $data = <DATA>; cmpthese( 10000, { 'table' => 'my $a = rot13($data),"\n"', 'tr' => 'my $a = rot13tr($data),"\n"', } ); # for PM-friendliness, I have added newlines to the data -- none are p +resent # in the "production" code. __DATA__ Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Vivamus luct +us nulla sed tellus. Sed vitae sapien in elit vestibulum faucibus. Maecenas sol +licitudin, magna quis vestibulum convallis, ligula nulla fringilla augue, nec ru +trum dolor quam vel justo. Vivamus nisi odio, ullamcorper sed, venenatis id, imp +erdiet eleifend, neque. Vivamus justo felis, euismod in, convallis auctor, eg +estas id, purus. In rutrum nisi in lectus. Aliquam elementum placerat dui. Integ +er ut pede sit amet magna pulvinar placerat. Maecenas massa lorem, lobortis ut, a +dipiscing at, suscipit nec, lectus. Curabitur mattis adipiscing sem. Mauris phar +etra vehicula eros. Sed pellentesque elit laoreet augue. Cras non tellus. Q +uisque volutpat lectus in sem. Fusce vulputate justo ut pede. Aliquam ante pe +de, tempor in, dictum in, blandit eu, sapien. Morbi vestibulum, metus eu auctor v +ulputate, nulla lectus condimentum nisi, ac pulvinar ligula nunc in felis. Curab +itur id orci ac est luctus molestie.

After testing to make sure the same (correct) results were returned from both subs, I ran the benchmark.

Results:

table 387/s -- -100% tr 114286/s 29468% --

No, that's not a mistake: tr/// is nearly 300 times the speed of using a substitution table. Needless to say, my employer approved the time. ;-)

Lessons?

  1. When doing things "for efficiency", make sure they're actually efficient.
  2. A little demonstration goes a long way.
  3. Sometimes it's worth thinking more up-front.

Updates:

<radiant.matrix>
A collection of thoughts and links from the minds of geeks
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet

Replies are listed 'Best First'.
Re: The indisputable speed of tr///
by ikegami (Patriarch) on Jun 26, 2006 at 19:54 UTC
    • Bad test. Change
      my $data = <DATA>;
      to
      our $data = <DATA>;
      so your code can see the variable.

      You'll see the error if you do:

      'table' => 'use strict; my $a = rot13($data),"\n"', 'tr' => 'use strict; my $a = rot13tr($data),"\n"',

      Oddly, the results show you didn't suffer from the bug when you ran your code. The code you posted code and the code you ran are not the same.

    • Buggy test. You only read in the first line of data.

    • Bad interpretation.
      "nearly 300 times faster than"
      should read
      "nearly 200 times faster than"
      or
      "nearly 300 times the speed of"

      ( No, he was right. )

    New results:

    Benchmark: running table, tr, each for at least 3 CPU seconds... table: 4 wallclock secs ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 33 +1.84/s (n=1040) tr: 3 wallclock secs ( 3.23 usr + 0.00 sys = 3.23 CPU) @ 65 +432.56/s (n=211020) Rate table tr table 332/s -- -99% tr 65433/s 19618% --

    (tr is 100x faster than table.)

    New Benmark code:

      As for the my -> our change, thanks; I ran into and fixed that bug, must've pasted an old version. Fixed now.

      As for "first line", you'll note the commment above the DATA block that I added newlines to the data for posting on PM, but that the code I ran had *no* newlines in the DATA block. All the data was on the first line, in other words. Even if that were not the case, it wouldn't be buggy, it would just be testing less text.

      As for the "300x faster than", you're technically correct, it's "300x as fast as". However, you're also a pedant. ;-) Fixed, in any case.

      <radiant.matrix>
      A collection of thoughts and links from the minds of geeks
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: The indisputable speed of tr///
by Zaxo (Archbishop) on Jun 26, 2006 at 20:06 UTC

    Just one more question,

    • Did it help?

    In my experience, I/O bottlenecks or pathological algorithms are usually the culprits when apps become too slow.

    That said, tr/// is certainly the right way to do it. Teardown time alone on that hash is probably longer than tr/// takes - not to mention all the memory management and copying involved in the split.

    The operation is only done once, though, and 300 times nothing can still be pretty small.

    After Compline,
    Zaxo

      Did it help? Quite a bit, actually. This was one component in an otherwise quite decent system, written by a coder who was essentially a temp. The file-IO stuff was handled as efficiently as reasonable in Perl and with appropriate error-checking. It's entirely a File-IO type of system.

      The operation is only done once, though, and 300 times nothing can still be pretty small.

      See, that's just it. We're talking about large chunks of text, handled a line at a time for memory's sake -- so this operation is repeated tens of thousands of times per run, and many thousands of runs per day (many of which are in parallel -- this doesn't slam the processor by any stretch, since we wait mostly on I/O).

      Now, the actual performance difference in the app at large was no where near the 30_000% in the demonstration but it was significant, and certainly worth doing. This change alone made roughly a 350% difference, and that was after improving the I/O routine to read and process up to 500 lines of text at a time, instead of one.

      <radiant.matrix>
      A collection of thoughts and links from the minds of geeks
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet

        I wonder ... if all you were doing was some transliteration why would you bother with the lines at all? Read and process say 64 KB at once. As I said, not sure this is doable in your case.

        Second ... why would you spend time "figuring out reasonable character classes? If you have the mapping hash, just build the tr/// out of it. I just tried to benchmark the difference between tr/a-z/A-Z/ and tr/abcdef...xyz/ABCDEF...ZYX/ and the later actually came out a tiny little bit faster. So you could either have the tr/// generated from the hash and copy the result into the script or use string eval to define a subroutine to do the transliteration.  my $sub = eval 'sub { my $x = shift(); $x=~ tr/' . join('', keys %h) . '/' . join('', values %h) . '/; return $x}'; or something. If the loop is simple enough you could eval the whole loop so that you do not waste time by the subroutine call for each line. Or am I missing something?

Re: The indisputable speed of tr///
by ambrus (Abbot) on Jun 27, 2006 at 09:05 UTC

    I agree with you in that the tr operator gives very good speed, so you should probably use it in this case if this takes significanf runtime in your script.

    However, let me investigate how much speed I can gain without using the tr operator. I will show that the rot13 solution you show isn't nearly optimal even in that case. (I'll assume that you're translating byte strings.)

    # The original subroutine rot13sub "orig", sub { # as close as possible to source project's code join '', map { exists $rot{$_} ? $rot{$_} : $_ } split('',shift); }; # Firstly, let's fill in the defaults in the hash table so that we # can use a hash slice my %rotd = %rot; exists($rotd{$_}) or $rotd{$_} = $_ for map chr, 0 .. 255; rot13sub "hashslice-s", sub { join '', @rotd{split('',shift)}; };
    # Now let's use an array slice instead of a hash my @rot = do { my $c; map { $c = chr($_); exists $rot{$c} ? ord($rot{$c}) : $_ } 0 .. 255; }; rot13sub "arrayslice", sub { pack "C*", @rot[unpack("C*", $_[0])]; }; # Now let's try some solutions with s, as demerphq has suggested. # The first one has to know the exact character range to be replaced. rot13sub "gsub-class", sub { (my $s = $_[0]) =~ s/([a-zA-Z])/$rot{$1}/g; $s; }; rot13sub "gsub-dot", sub { (my $s = $_[0]) =~ s/(.)/$rotd{$1}/gs; $s; }; # Now your tr equivalent rot13sub "tr", sub { (my $s = shift) =~ tr/a-zA-Z/n-za-mN-ZA-M/; $s; };

    The results on my home computer is:

    Rate orig hashslice-m gsub-dot hashslice-s gsub-clas +s hashslice-u arrayslice tr orig 962/s -- -20% -30% -40% -43 +% -58% -82% -99% hashslice-m 1206/s 25% -- -12% -25% -29 +% -47% -78% -99% gsub-dot 1372/s 43% 14% -- -15% -19 +% -40% -75% -99% hashslice-s 1605/s 67% 33% 17% -- -5 +% -30% -70% -99% gsub-class 1687/s 75% 40% 23% 5% - +- -26% -69% -99% hashslice-u 2278/s 137% 89% 66% 42% 35 +% -- -58% -98% arrayslice 5414/s 463% 349% 295% 237% 221 +% 138% -- -96% tr 148174/s 15308% 12184% 10701% 9135% 8681 +% 6403% 2637% --
    Which shows that the fastest method I could find that doesn't use tr is still much slower than the tr one, but only 26 times slower, not 150 times slower as the original code you've shown is.

    For the way I test here, see also my Benchmarking perfect shuffles. For another case when I use a lookup table instead of transliterations because it's not easy to calculate the correct arguments for tr, see Re (Obfu generator for): Ode for getprotobyname.

    Update: added two solutions using s as demerphq has suggested.

      Hey, neat!

      I know that the solution using split is tremendously sub-optimal, but that's almost exactly what the code I was given did (just in two steps with an array assignment in between). Thanks for the work on so many ways to do it -- I'm pointing my boss toward your post as further evidence that we now have the optimal solution for that particular chunk of code.

      <radiant.matrix>
      A collection of thoughts and links from the minds of geeks
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: The indisputable speed of tr///
by demerphq (Chancellor) on Jun 27, 2006 at 08:15 UTC

    tr/// is only going to help you if all of your transformations are 1:1 in terms of chars. If there are any examples of 1:N then its not going to help you. IE, if you need to convert chars with umlouts to their non-umlaut equivelents then tr/// wont help you. As for it being faster im surprised there was any question. Especially with the code you are using. join/map/split is a pretty horrible way to handle this. If i had to use a hash id do something like s/([...])/$map{$1}/g which is isnt going to be faster than tr/// but will certainly be faster than join/map/split.

    ---
    $world=~s/war/peace/g

Re: The indisputable speed of tr///
by shmem (Chancellor) on Jun 27, 2006 at 08:52 UTC
    Basically you are saying that using the right tool makes the code run faster. Agreed. Why use s/// or join,map,split for simple transliterations?

    I wouldn't use a 40 ton truck to go fetch a package of cigarettes, if I have a bike ;-)

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: The indisputable speed of tr///
by Ieronim (Friar) on Jun 28, 2006 at 16:32 UTC
    The line
    our $data = <DATA>;
    is still useless because you read to $data only the first string from __DATA__ section :)
    Please modify it to
    our $data = join "", <DATA>;
    IMHO it is exactly what you wanted to do:)

    The worst thing about tr/// is that it can produce only character-to-character mapping. The best solution aviding this problem I found uses eval at compilation time:

    BEGIN { our %rot = ( 'A' => 'N','B' => 'O','C' => 'P','D' => 'Q','E' => 'R','F' => 'S', 'G' => 'T','H' => 'U','I' => 'V','J' => 'W','K' => 'X','L' => 'Y', 'M' => 'Z','N' => 'A','O' => 'B','P' => 'C','Q' => 'D','R' => 'E', 'S' => 'F','T' => 'G','U' => 'H','V' => 'I','W' => 'J','X' => 'K', 'Y' => 'L','Z' => 'M','a' => 'n','b' => 'o','c' => 'p','d' => 'q', 'e' => 'r','f' => 's','g' => 't','h' => 'u','i' => 'v','j' => 'w', 'k' => 'x','l' => 'y','m' => 'z','n' => 'a','o' => 'b','p' => 'c', 'q' => 'd','r' => 'e','s' => 'f','t' => 'g','u' => 'h','v' => 'i', 'w' => 'j','x' => 'k','y' => 'l','z' => 'm', ); my $substs = join "\n", map { " s/$_/$rot{$_}/g;" } keys %rot; eval <<SUB; sub rot13s { local \$_ = shift; $substs; return \$_; } SUB }
    It is only 52 times slower than tr/// :
    Rate table s///g tr table 448/s -- -78% -100% s///g 2026/s 352% -- -98% tr 106383/s 23654% 5152% --

      The problem about only reading the first line of the string was raised and fixed in the first post of this thread.

      Your solution doesn't work. For example,
      print(rot13s(rot13s('AN')), "\n");
      prints 'AA' or 'NN' (depending on the hash ordering), but not the expected 'AN'.

      You also don't escape the keys and value of %rot as you should. The OP said this is a simplification of his problem, so the presense of punctuation in %rot is likely.

      Refer to an earlier post of mine for a working solution using eval to build a substitution and a working solution that uses substitution without using eval.

      "[rot13s] is only 52 times slower than [rot13tr]" makes no sense. I think you meant "rot13tr is only 52 times faster than rot13s".

      *Only* 52 times faster? So if rot13tr takes 30 seconds — The OP said he had lots of data to process — rot13s takes *only* 26 minutes? I wish my 30 minutes drive home every night only took 30 seconds.

        yes, i was wrong :) i forgot that i at first substitute all A to N and then all N to A, INCLUDING those A's which were already mapped to N :))