PerlMonks  

Re: Foreign language characters...

by seattlejohn (Deacon)
on Oct 08, 2002 at 14:57 UTC (id://203664)


in reply to Foreign language characters...

There are lots and lots of characters that can cause problems like this, and you really have no hope of enumerating them all. You should think seriously about translating everything except a small subset of legal characters to underscores. That way you will catch accented characters, Unicode characters, undesirable characters and sequences such as .. and ~ and /, and so on. Something like this would probably do:

    $name =~ tr/-A-Za-z0-9/_/c;

I know this doesn't completely answer your question, because you asked how to translate accented characters to their unaccented versions, not how to replace them across the board. The big problem with the de-accenting approach is that there is a huge variety of accented characters you potentially have to deal with. What's more, I believe that the codes used for accented characters (which are not part of standard ASCII) vary depending on what character set you are actually using. For example, é is the single byte 0xE9 in ISO-8859-1 but the two bytes 0xC3 0xA9 in UTF-8.
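The whitelist approach above could be wrapped up along these lines. This is only a minimal sketch: the sub name sanitize_name and the sample inputs are made up for illustration, and the set of allowed characters is the same one used in the tr above.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the original
# question): replace every character outside a small whitelist with "_".
sub sanitize_name {
    my ($name) = @_;
    # The /c flag complements the search list, so anything that is NOT
    # a hyphen, an ASCII letter, or a digit is translated to "_".
    (my $clean = $name) =~ tr/-A-Za-z0-9/_/c;
    return $clean;
}

print sanitize_name("../etc/passwd"), "\n";   # prints ___etc_passwd
print sanitize_name("my report.txt"), "\n";   # prints my_report_txt
print sanitize_name("caf\xe9"), "\n";         # prints caf_
```

Note that the dot is replaced too, so a file extension ends up as _txt. If you need to keep extensions you would have to add . to the allowed list and then reject .. separately.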

You didn't say explicitly, but I'm assuming you're creating these Web pages via CGI. Security is something you'll need to take very seriously if you are doing things like writing files based on user-supplied filenames. You might also want to read up on taint mode and check out Ovid's CGI Course for some further hints.
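As a rough illustration of how taint mode fits in: when a script runs under perl -T, user-supplied data cannot be used in things like file opens until it has been extracted through a regex capture. The sketch below assumes a hypothetical helper called untaint_name and an arbitrary 64-character limit; neither comes from the original question. It rejects bad names outright instead of rewriting them.

```perl
use strict;
use warnings;

# Hypothetical helper: return a validated copy of a user-supplied name,
# or nothing (undef in scalar context) if it contains anything unexpected.
# Under perl -T, the captured $1 is considered untainted.
sub untaint_name {
    my ($name) = @_;
    return unless defined $name;
    # Allow only hyphens, ASCII letters, digits, and underscores.
    # The 1-64 length bound is an arbitrary assumption for this sketch.
    if ($name =~ /\A([-A-Za-z0-9_]{1,64})\z/) {
        return $1;
    }
    return;
}

my $safe = untaint_name("report-2002");
print defined $safe ? "ok: $safe\n" : "rejected\n";   # prints ok: report-2002

my $bad = untaint_name("../etc/passwd");
print defined $bad ? "ok: $bad\n" : "rejected\n";     # prints rejected
```

The pattern is anchored with \A and \z rather than ^ and $, so a trailing newline in the input cannot slip through.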
