Your suspicions are confirmed. The optimization has sped my strings-based tokenizer up by 39%. Interestingly, with the expanded test string I'm using, my original still edged out the optimized version of your single-regex tokenizer, by about 11%. Here are the results:
            Rate    lists oneregex  str_org  str_opt
lists     2347/s       --     -38%     -45%     -60%
oneregex  3807/s      62%       --     -10%     -35%
str_org   4237/s      81%      11%       --     -28%
str_opt   5882/s     151%      55%      39%       --
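For anyone reading the table: each percentage is the row's rate relative to the column's rate, so the numbers can be re-derived from the Rate column alone. Here is a minimal sketch of that arithmetic. The rates are copied from the results above; the `pct_diff` helper is a name I made up for illustration and is not part of Benchmark.

```perl
use strict;
use warnings;

# Rates (iterations per second) copied from the results above.
my %rate = (
    lists    => 2347,
    oneregex => 3807,
    str_org  => 4237,
    str_opt  => 5882,
);

# Hypothetical helper: how much faster (+) or slower (-) $row is than $col,
# as a whole-number percentage.
sub pct_diff {
    my ($row, $col) = @_;
    return sprintf '%.0f', 100 * ($rate{$row} / $rate{$col} - 1);
}

printf "str_opt vs str_org:  %s%%\n", pct_diff('str_opt', 'str_org');   # 39%
printf "str_org vs oneregex: %s%%\n", pct_diff('str_org', 'oneregex');  # 11%
```

Swapping the arguments gives the matching negative entry, e.g. `pct_diff('str_org', 'str_opt')` comes out at -28.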
This optimization is definitely effective. Thanks very much.
Oh. And here's the expanded test string if you want to play with it:
my $msg = q{This, is, an, example. Keep $2.50, 1,500, and 192.168.1.1. I want to work this thing out a LITTEL!!!!L BITH!!!!! MORE@@@@@@ with some,.unhapp.yword,combinations.and , a little .. bit of,, confusing,text hopefully @#@#@#@%#$57)#$*(#&)(*$ it will @#@][] work.};
In reply to Re: Re: Re: Re: tokenize plain text messages by revdiablo, in thread tokenize plain text messages by revdiablo