deibyz has asked for the wisdom of the Perl Monks concerning the following question:
I'm using Text::Query::Advanced to let users search a number of documents. Most of these documents are written in Spanish (yes, I'm Spanish, that's the reason for my bad English ;)), and they contain "funny" characters such as á, é, í, ó, ú. A search for "camion" should match "camión" as well as "camion", so I'm trying to find a simple way to get rid of those characters.
A simple substitution may work:
s/á/a/g; s/é/e/g; ... s/Ú/u/g;
But that would be too slow, since it has to make a lot of passes through the string (maybe a long one), and I'd also have to handle more characters in the future (â, ä, à, ...).
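For what it's worth, the many passes can probably be folded into one with a lookup table and a single character class. This is only a sketch: the `%plain` table and the name `strip_accents_table` are made up for illustration, it maps upper case to upper case (unlike the s/Ú/u/g above), and it assumes the text has already been decoded into Perl characters.

```perl
use strict;
use warnings;
use utf8;    # the accented literals below are read as characters

# Hypothetical lookup table; extend it when new characters turn up.
my %plain = (
    'á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ú' => 'u',
    'Á' => 'A', 'É' => 'E', 'Í' => 'I', 'Ó' => 'O', 'Ú' => 'U',
);

# Build one character class from the table keys, compiled once.
my $accented  = join '', keys %plain;
my $accent_re = qr/([$accented])/;

# Replace every listed character in a single pass over the string.
sub strip_accents_table {
    my ($text) = @_;
    $text =~ s/$accent_re/$plain{$1}/g;
    return $text;
}
```

Adding â, ä, à and friends then means adding entries to the table, not more substitutions.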
I've tried the tr/áéíóú/aeiou/ solution, but since "á" is a two-byte character, it doesn't work.
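A likely cause is that tr/// is seeing bytes instead of characters on one side or both. The sketch below assumes the script is saved as UTF-8 and the documents are UTF-8 encoded; `fold_vowels` is a hypothetical helper name. With `use utf8` the literals in tr/// are characters, and `Encode::decode` turns the input bytes into characters too.

```perl
use strict;
use warnings;
use utf8;                 # literals inside tr/// are characters, not bytes
use Encode qw(decode);

# Map accented vowels to plain ones, one character at a time.
sub fold_vowels {
    my ($text) = @_;
    (my $out = $text) =~ tr/áéíóúÁÉÍÓÚ/aeiouAEIOU/;
    return $out;
}

my $bytes = "cami\xc3\xb3n";            # "camión" as raw UTF-8 bytes
my $chars = decode('UTF-8', $bytes);    # now five characters plus "n"
print fold_vowels($chars), "\n";        # prints "camion"
```

tr/// still makes a single pass, but every new character has to be added to both lists by hand.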
I've read perluniintro and perlunicode, but I haven't found anything that helps.
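One approach that avoids listing characters at all is Unicode decomposition: Unicode::Normalize (shipped with Perl since 5.8) can split "á" into "a" plus a combining accent, and the combining marks can then be removed with one regex. A minimal sketch, again assuming UTF-8 input; `strip_marks` is a hypothetical name.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Decompose each character into base letter + combining marks,
# then drop the marks (Unicode category Mn, nonspacing mark).
sub strip_marks {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/\p{Mn}//g;
    return $decomposed;
}

my $chars = decode('UTF-8', "cami\xc3\xb3n");   # "camión" from UTF-8 bytes
print strip_marks($chars), "\n";                # prints "camion"
```

This covers â, ä, à and so on without further work, but note that it also turns "ñ" into "n" and "ü" into "u", which may or may not be wanted for Spanish text. The same function would have to be applied to both the documents and the search terms.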
Any ideas are welcome.
Thanks in advance,
deibyz
Edited: Title changed.
Re: Playing with "funny" chars
by Eyck (Priest) on Sep 27, 2004 at 12:19 UTC

Re: Playing with "funny" chars
by cog (Parson) on Sep 27, 2004 at 15:09 UTC
by deibyz (Hermit) on Sep 27, 2004 at 15:19 UTC
by cog (Parson) on Sep 27, 2004 at 16:29 UTC

Re: Playing with "funny" chars
by mischief (Hermit) on Sep 27, 2004 at 12:57 UTC
by itub (Priest) on Sep 27, 2004 at 13:12 UTC
by deibyz (Hermit) on Sep 27, 2004 at 13:31 UTC
by itub (Priest) on Sep 27, 2004 at 15:42 UTC

Re: Playing with extended chars
by chanio (Priest) on Sep 27, 2004 at 23:14 UTC