I'm using Text::Query::Advanced to let users search a set of documents. Most of these documents are written in Spanish (yes, I'm Spanish, which is the reason for my bad English ;)), and they contain accented characters such as á, é, í, ó, ú. A search for "camion" should match "camión" as well as "camion", so I'm trying to find a simple way to strip the accents from those characters.
A simple substitution may work:
s/á/a/g; s/é/e/g; ... s/Ú/u/g;
But that would be too slow, since it makes many passes over the string (which may be long), and I would have to account for more characters in the future (â, ä, à, ...).
I've tried the tr/áéíóú/aeiou/ solution, but since "á" is a two-byte character in UTF-8, it doesn't work.
I've read perluniintro and perlunicode, but I haven't found anything that helps.
Any ideas are welcome.
Thanks in advance,
deibyz
Edited: Title changed.
In reply to Playing with extended chars by deibyz
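One possible approach, offered as a sketch rather than a definitive answer: decompose each character with the core Unicode::Normalize module and then delete the combining marks in a single pass. The helper name strip_accents is hypothetical, and the sketch assumes the text is already a decoded character string (here via "use utf8" for the literals), not raw UTF-8 bytes. Under the same assumption tr/// also works, because each accented letter is then one character instead of two bytes.

```perl
use strict;
use warnings;
use utf8;                        # literals in this file are UTF-8 character strings
use Unicode::Normalize qw(NFD);  # core module

# Hypothetical helper: split each letter into base + combining mark,
# then drop the marks. Handles á, â, ä, à, ... without listing them.
sub strip_accents {
    my ($text) = @_;
    my $decomposed = NFD($text);   # "ó" becomes "o" followed by U+0301
    $decomposed =~ s/\p{Mn}//g;    # remove nonspacing (combining) marks
    return $decomposed;
}

# With decoded characters, the original tr/// idea works too.
(my $via_tr = "camión") =~ tr/áéíóú/aeiou/;

binmode STDOUT, ':encoding(UTF-8)';
print strip_accents("camión"), "\n";   # prints "camion"
print $via_tr, "\n";                   # prints "camion"
```

Text read from files or a database would need to be decoded first, for example with Encode::decode('UTF-8', $bytes), before either technique applies. Note that this sketch also turns "ñ" into "n", which may or may not be wanted for Spanish searches.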