Recently, I was assigned to optimize some code that substituted certain characters for certain other characters. In other words, the application for which tr/// was created. This used to run a couple dozen times a day, but now that it runs much more, the performance has become unacceptable.
Unfortunately, the previous coder had decided that a hash which mapped one char to another would be a better choice than the transliteration. Above the definition of said hash, s*he makes the comment:
# pre-define relationships for runtime efficiency
I mentioned to my employer that this could probably be re-written with the appropriate transliteration, but it would take me some time to figure out reasonable character classes. The comment, however, led my employer to believe that I might be sacrificing efficiency for "correctness". Therefore, I decided to demonstrate the value of the tr///-based solution using a problem with well-defined char classes -- ROT-13:
#!/usr/bin/perl use strict; use warnings; use Benchmark qw':all'; my %rot = ( 'A' => 'N','B' => 'O','C' => 'P','D' => 'Q','E' => 'R','F' => 'S', 'G' => 'T','H' => 'U','I' => 'V','J' => 'W','K' => 'X','L' => 'Y', 'M' => 'Z','N' => 'A','O' => 'B','P' => 'C','Q' => 'D','R' => 'E', 'S' => 'F','T' => 'G','U' => 'H','V' => 'I','W' => 'J','X' => 'K', 'Y' => 'L','Z' => 'M','a' => 'n','b' => 'o','c' => 'p','d' => 'q', 'e' => 'r','f' => 's','g' => 't','h' => 'u','i' => 'v','j' => 'w', 'k' => 'x','l' => 'y','m' => 'z','n' => 'a','o' => 'b','p' => 'c', 'q' => 'd','r' => 'e','s' => 'f','t' => 'g','u' => 'h','v' => 'i', 'w' => 'j','x' => 'k','y' => 'l','z' => 'm', ); sub rot13 ($) { # as close as possible to source project's code join '', map { exists $rot{$_} ? $rot{$_} : $_ } split('',shift); } sub rot13tr ($) { (my $s = shift) =~ tr/a-zA-Z/n-za-mN-ZA-M/; $s; } our $data = <DATA>; cmpthese( 10000, { 'table' => 'my $a = rot13($data),"\n"', 'tr' => 'my $a = rot13tr($data),"\n"', } ); # for PM-friendliness, I have added newlines to the data -- none are p +resent # in the "production" code. __DATA__ Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Vivamus luct +us nulla sed tellus. Sed vitae sapien in elit vestibulum faucibus. Maecenas sol +licitudin, magna quis vestibulum convallis, ligula nulla fringilla augue, nec ru +trum dolor quam vel justo. Vivamus nisi odio, ullamcorper sed, venenatis id, imp +erdiet eleifend, neque. Vivamus justo felis, euismod in, convallis auctor, eg +estas id, purus. In rutrum nisi in lectus. Aliquam elementum placerat dui. Integ +er ut pede sit amet magna pulvinar placerat. Maecenas massa lorem, lobortis ut, a +dipiscing at, suscipit nec, lectus. Curabitur mattis adipiscing sem. Mauris phar +etra vehicula eros. Sed pellentesque elit laoreet augue. Cras non tellus. Q +uisque volutpat lectus in sem. Fusce vulputate justo ut pede. Aliquam ante pe +de, tempor in, dictum in, blandit eu, sapien. Morbi vestibulum, metus eu auctor v +ulputate, nulla lectus condimentum nisi, ac pulvinar ligula nunc in felis. Curab +itur id orci ac est luctus molestie.
After testing to make sure the same (correct) results were returned from both subs, I ran the benchmark.
Results:
table 387/s -- -100% tr 114286/s 29468% --
No, that's not a mistake: tr/// is nearly 300 times the speed of using a substitution table. Needless to say, my employer approved the time. ;-)
Lessons?
Updates:
In reply to The indisputable speed of tr/// by radiantmatrix
For: | Use: | ||
& | & | ||
< | < | ||
> | > | ||
[ | [ | ||
] | ] |