Benchmark: timing 100000 iterations of optimized, original, paladin...
original: 25 wallclock secs (23.14 usr + 0.00 sys = 23.14 CPU) @ 4320.77/s (n=100000)
paladin: 18 wallclock secs (16.93 usr + 0.00 sys = 16.93 CPU) @ 5905.63/s (n=100000)