Recently, I was assigned to optimize some code that substituted certain characters for certain other characters. In other words, the application for which tr/// was created. This used to run a couple dozen times a day, but now that it runs much more, the performance has become unacceptable.

Unfortunately, the previous coder had decided that a hash which mapped one char to another would be a better choice than the transliteration. Above the definition of said hash, s*he makes the comment:

# pre-define relationships for runtime efficiency

I mentioned to my employer that this could probably be re-written with the appropriate transliteration, but it would take me some time to figure out reasonable character classes. The comment, however, led my employer to believe that I might be sacrificing efficiency for "correctness". Therefore, I decided to demonstrate the value of the tr///-based solution using a problem with well-defined char classes -- ROT-13:

#!/usr/bin/perl use strict; use warnings; use Benchmark qw':all'; my %rot = ( 'A' => 'N','B' => 'O','C' => 'P','D' => 'Q','E' => 'R','F' => 'S', 'G' => 'T','H' => 'U','I' => 'V','J' => 'W','K' => 'X','L' => 'Y', 'M' => 'Z','N' => 'A','O' => 'B','P' => 'C','Q' => 'D','R' => 'E', 'S' => 'F','T' => 'G','U' => 'H','V' => 'I','W' => 'J','X' => 'K', 'Y' => 'L','Z' => 'M','a' => 'n','b' => 'o','c' => 'p','d' => 'q', 'e' => 'r','f' => 's','g' => 't','h' => 'u','i' => 'v','j' => 'w', 'k' => 'x','l' => 'y','m' => 'z','n' => 'a','o' => 'b','p' => 'c', 'q' => 'd','r' => 'e','s' => 'f','t' => 'g','u' => 'h','v' => 'i', 'w' => 'j','x' => 'k','y' => 'l','z' => 'm', ); sub rot13 ($) { # as close as possible to source project's code join '', map { exists $rot{$_} ? $rot{$_} : $_ } split('',shift); } sub rot13tr ($) { (my $s = shift) =~ tr/a-zA-Z/n-za-mN-ZA-M/; $s; } our $data = <DATA>; cmpthese( 10000, { 'table' => 'my $a = rot13($data),"\n"', 'tr' => 'my $a = rot13tr($data),"\n"', } ); # for PM-friendliness, I have added newlines to the data -- none are p +resent # in the "production" code. __DATA__ Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Vivamus luct +us nulla sed tellus. Sed vitae sapien in elit vestibulum faucibus. Maecenas sol +licitudin, magna quis vestibulum convallis, ligula nulla fringilla augue, nec ru +trum dolor quam vel justo. Vivamus nisi odio, ullamcorper sed, venenatis id, imp +erdiet eleifend, neque. Vivamus justo felis, euismod in, convallis auctor, eg +estas id, purus. In rutrum nisi in lectus. Aliquam elementum placerat dui. Integ +er ut pede sit amet magna pulvinar placerat. Maecenas massa lorem, lobortis ut, a +dipiscing at, suscipit nec, lectus. Curabitur mattis adipiscing sem. Mauris phar +etra vehicula eros. Sed pellentesque elit laoreet augue. Cras non tellus. Q +uisque volutpat lectus in sem. Fusce vulputate justo ut pede. Aliquam ante pe +de, tempor in, dictum in, blandit eu, sapien. Morbi vestibulum, metus eu auctor v +ulputate, nulla lectus condimentum nisi, ac pulvinar ligula nunc in felis. Curab +itur id orci ac est luctus molestie.

After testing to make sure the same (correct) results were returned from both subs, I ran the benchmark.

Results:

table 387/s -- -100% tr 114286/s 29468% --

No, that's not a mistake: tr/// is nearly 300 times the speed of using a substitution table. Needless to say, my employer approved the time. ;-)

Lessons?

  1. When doing things "for efficiency", make sure they're actually efficient.
  2. A little demonstration goes a long way.
  3. Sometimes it's worth thinking more up-front.

Updates:

<radiant.matrix>
A collection of thoughts and links from the minds of geeks
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet

In reply to The indisputable speed of tr/// by radiantmatrix

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.