in reply to Re^3: The indisputable speed of tr///
in thread The indisputable speed of tr///

I wonder ... if all you were doing was some transliteration why would you bother with the lines at all?

Good point. I now have the code altered in test to use sysread to read in 128KB at a time, and there is definitely a boost there. Probably roll that into prod with the next release. Thanks!
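For readers following along, chunked reading along those lines might look roughly like the sketch below. It is not the production code: `rot13_stream` is a hypothetical name, ROT-13 stands in for the real mapping, and it assumes a single-byte mapping on raw (byte-oriented) handles, so a chunk boundary can never split a character.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in_fh to $out_fh, transliterating each chunk.
# Assumes both handles are byte-oriented (no :utf8 layer).
sub rot13_stream {
    my ($in_fh, $out_fh, $chunk_size) = @_;
    $chunk_size ||= 128 * 1024;    # 128KB, the size mentioned above

    my $buf;
    while (1) {
        my $n = sysread($in_fh, $buf, $chunk_size);
        die "sysread failed: $!" unless defined $n;
        last if $n == 0;           # end of file

        # tr/// works on the whole chunk; line boundaries are irrelevant.
        $buf =~ tr/A-Za-z/N-ZA-Mn-za-m/;
        print {$out_fh} $buf or die "write failed: $!";
    }
    return;
}
```

Called as `rot13_stream(\*STDIN, \*STDOUT)`, it would filter standard input in 128KB blocks.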

why would you spend time "figuring out reasonable character classes"? If you have the mapping hash, just build the tr/// out of it.

I started there, actually, with code that looks something like this:

my %rot = (
    'A' => 'N','B' => 'O','C' => 'P','D' => 'Q','E' => 'R','F' => 'S',
    'G' => 'T','H' => 'U','I' => 'V','J' => 'W','K' => 'X','L' => 'Y',
    'M' => 'Z','N' => 'A','O' => 'B','P' => 'C','Q' => 'D','R' => 'E',
    'S' => 'F','T' => 'G','U' => 'H','V' => 'I','W' => 'J','X' => 'K',
    'Y' => 'L','Z' => 'M','a' => 'n','b' => 'o','c' => 'p','d' => 'q',
    'e' => 'r','f' => 's','g' => 't','h' => 'u','i' => 'v','j' => 'w',
    'k' => 'x','l' => 'y','m' => 'z','n' => 'a','o' => 'b','p' => 'c',
    'q' => 'd','r' => 'e','s' => 'f','t' => 'g','u' => 'h','v' => 'i',
    'w' => 'j','x' => 'k','y' => 'l','z' => 'm',
);
my ($tr_a, $tr_b);
for (keys %rot) { $tr_a .= $_; $tr_b .= $rot{$_} }
print "tr/$tr_a/$tr_b/;\n";

However, I wasn't kidding when I said that ROT-13 is a simplified case of the actual problem! I spent some time finding the char classes, not for runtime efficiency, but for readability/maintainability. There's a reasonable chance of the rules changing, and a tr/// with 300-or-so-char clauses is mighty unreadable.

The string eval is an interesting solution, and one I will have to play around with -- unfortunately, it won't be allowed in prod without an exception filing (string evals are against our coding standards), so there will have to be significant benefit to get it approved. Still, something I hadn't considered, so I thank you for that...

<radiant.matrix>
A collection of thoughts and links from the minds of geeks
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet

Re^5: The indisputable speed of tr///
by ikegami (Patriarch) on Jun 27, 2006 at 16:59 UTC

    His string eval was missing calls to quotemeta. I fixed this, added a few more solutions, and reran the benchmarks. They show that:

    • Without using eval (rot13dynre), your task can be accomplished in two-thirds of the time it currently takes (rot13splitmap).

      my %rot;
      @rot{'A'..'M'} = ('N'..'Z');
      @rot{'N'..'Z'} = ('A'..'M');
      @rot{'a'..'m'} = ('n'..'z');
      @rot{'n'..'z'} = ('a'..'m');

      my $chars_to_change = join('', map quotemeta, keys %rot);

      sub rot13dynre {
          local $_ = @_ ? $_[0] : $_;
          s/([$chars_to_change])/$rot{$1}/g;
          return $_;
      }

    • By using eval (rot13builttr), your task can be accomplished in 1/167th of the time it currently takes (rot13splitmap).

      my %rot;
      @rot{'A'..'M'} = ('N'..'Z');
      @rot{'N'..'Z'} = ('A'..'M');
      @rot{'a'..'m'} = ('n'..'z');
      @rot{'n'..'z'} = ('a'..'m');

      *rot13builttr = eval
          'sub { local $_ = @_ ? $_[0] : $_; tr{'
          . join('', map quotemeta, keys %rot)
          . '} {'
          . join('', map quotemeta, values %rot)
          . '}; return $_; }';

    That should help convince your peers that this limited, well controlled, easily testable use of eval is appropriate here.

    Benchmark results:

    Benchmark:

    Update: Reorganization for clarity.

Re^5: The indisputable speed of tr///
by runrig (Abbot) on Jun 27, 2006 at 17:27 UTC
    The string eval is an interesting solution, (snip)... unfortunately, it won't be allowed in prod ...

    There is nothing wrong with string eval when you have complete control over what's going into it, and there's no real speed issue if it's a 'do once' thing where you use the results many times, and it sounds like that's how you will use it. And it's the only way to dynamically get things into tr/// so you can initialize it from, e.g., a hash, instead of hardcoding the tr/// translations (except as Jenda notes above, by generating the script itself, which is essentially the same thing as an eval anyway).

      There is nothing wrong with string eval when you have complete control over what's going into it,

      Oh, I agree with you. I also happen to agree with our coding standards, because far too many people with Perl on their resume are incompetent fools whose experience with Perl stops at making a config change to one of Matt's Scripts. Had I not already put carefully-crafted char classes into prod, I'm sure I could make a case for an exception (they do grant them after a business approval and a code review, which is quite reasonable).

      <radiant.matrix>
      A collection of thoughts and links from the minds of geeks
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re^5: The indisputable speed of tr///
by Jenda (Abbot) on Jun 27, 2006 at 17:30 UTC

    Yep, trying to update the tr/// would be a nightmare if it were this long. I don't know your code, but I thought maybe the best solution would be to keep the "source" of the transformation in the mapping hash (or in an external file) and then either regenerate the tr/// each time you make changes or each time you start the script.

    I understand the worries about string eval, but in this case it is gonna be safe. There will be no stuff coming from outside of the script in the evaled string, so you are not losing any security by this. Plus you may test that all the keys and values in the hash are single characters and escape the specials.
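A minimal sketch of that check-then-eval idea, for illustration only: `build_tr` is a hypothetical helper name, ROT-13 again stands in for the real mapping, and the sketch assumes plain single-character keys and values. It rejects anything else, escapes each character with quotemeta, and runs the string eval exactly once, at build time.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a single-character mapping hash into a
# compiled transliteration sub. Dies if any key or value is not exactly
# one character, or if the generated code fails to compile.
sub build_tr {
    my ($map) = @_;

    my ($from, $to) = ('', '');
    for my $k (sort keys %$map) {
        my $v = $map->{$k};
        die "bad mapping entry\n"
            unless defined $v && length($k) == 1 && length($v) == 1;
        $from .= quotemeta $k;    # escape specials such as - and \
        $to   .= quotemeta $v;
    }

    # Only validated, escaped characters ever reach the eval'd string.
    my $code = "sub { my (\$s) = \@_; \$s =~ tr{$from}{$to}; return \$s }";
    my $sub = eval $code or die "could not compile tr: $@";
    return $sub;
}

my %rot;
@rot{'A'..'M', 'N'..'Z'} = ('N'..'Z', 'A'..'M');
@rot{'a'..'m', 'n'..'z'} = ('n'..'z', 'a'..'m');

my $rot13 = build_tr(\%rot);
print $rot13->("Hello, World!"), "\n";
```

Because the eval happens once and the returned sub is reused, the per-call cost is just the compiled tr///; the validation loop is the part a code review would want to look at.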

    I think the tr/// syntax could be improved. It's fine if the list of transliterated characters is fairly short, but as it gets longer it's hard to keep the two lists in sync. I think it needs an /x modifier ;-) Maybe like this: tr/a-z => A-Z , +- => -+, 0-9 => 1275489603/x

      Maybe like this: tr/a-z => A-Z , +- => -+, 0-9 => 1275489603/x

      You can always do it like this:

      $var =~ tr [a-mn-z] [n-za-m];

      --
      David Serrano

Re^5: The indisputable speed of tr///
by duff (Parson) on Jun 27, 2006 at 18:32 UTC

    In perl6, this would Just Work:

    $string.trans(%rot); # perl6 version of tr///
    Which I think is quite cool. :-)