in reply to Re^3: The Björk Situation
in thread The Björk Situation
Actually, now that I've had a moment to look at it, unidecode DOESN'T fare so well, strictly from a speed point of view.
You made the mistake of modifying $string directly, so in every call after the first there are NO characters left to transliterate, and it benchmarked much faster than it should have. Once that is fixed by working on a copy, it doesn't have such a big lead. (Actually, none at all ;-) )
unidecode => sub{ my $text = $string; return unidecode($text); },
Yields:
             Rate unidecode deaccent2  deaccent
unidecode  6797/s        --       -3%      -87%
deaccent2  6979/s        3%        --      -86%
deaccent  50687/s      646%      626%        --
Nevertheless, unidecode probably IS the best choice, as it handles Unicode up to \x{FFFF}, not just up to \xFF.
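For anyone who wants to see the benchmarking pitfall in isolation, here is a minimal sketch using only core Perl. Note that deaccent_inplace is a hypothetical stand-in I made up for illustration (it is not the deaccent sub from this thread, and it only covers a handful of Latin-1 letters); it returns the number of characters it changed, so you can see how much work each call actually did.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an in-place transliterator (NOT the code from
# the thread): maps a few Latin-1 accented letters to ASCII through the
# aliased argument and returns how many characters it changed.
sub deaccent_inplace {
    return $_[0] =~ tr/\xE0-\xE5\xE8-\xEB\xF2-\xF6/aaaaaaeeeeooooo/;
}

my $original = "Bj\xF6rk Gu\xF0mundsd\xF3ttir";

# Wrong: every call works on the same shared variable, so only the first
# call finds anything to transliterate.
my $string = $original;
my @shared = map { deaccent_inplace($string) } 1 .. 3;

# Right: each call works on a fresh copy of the untouched original.
my @copied = map { my $text = $original; deaccent_inplace($text) } 1 .. 3;

print "shared: @shared\n";   # shared: 2 0 0
print "copied: @copied\n";   # copied: 2 2 2
```

With the shared variable, the second and third calls have nothing to do, which is exactly why a benchmark written that way looks faster than it should. Copying inside the benchmarked sub costs a little, but every candidate pays the same cost.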
Re^5: The Björk Situation
by rhesa (Vicar) on Feb 16, 2006 at 00:30 UTC