Handling file path with unusual characters

by samuelalfred (Sexton)
on Feb 24, 2009 at 09:07 UTC ( [id://745941]=perlquestion )

samuelalfred has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I'm writing a script with a GUI using Perl Tkx. Among other things, the user can open a file through a tk___getOpenFile dialog. This worked fine until recently, when I tried opening a file whose path contained certain "special" characters (such as å, ä, ö or é), which produced a program error. For debugging, I printed the path in the command window and found that Perl had replaced these characters with other, weird ones. Is there any command or setting to make Perl interpret my Swedish characters properly? Thank you in advance!

Replies are listed 'Best First'.
Re: Handling file path with unusual characters
by almut (Canon) on Feb 24, 2009 at 11:00 UTC

    Encoding issues are notoriously tricky — in particular when filenames are involved.  And with the info you've provided it's not exactly easy to come up with a do-that-and-you'll-be-happy recipe solution :)

    The first step when debugging encoding problems is to find out what encodings are involved.  In this particular case, it would be important to know the encoding the filesystem uses for filenames. Once you know that, you can decode the filenames from that encoding on input (coming from the filesystem) and encode them back to it on output. Perl's handling of filenames does not involve any fancy auto-conversion magic; it's all plain octets.  (Even on OSes like Windows, where there is a Unicode interface (aka "wide API") for filenames - meaning the OS would take care of converting from/to the system's native encoding - Perl currently does not make use of it.)
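
    As a rough illustration of that decode-on-input, encode-on-output pattern, here is a minimal sketch. It assumes the filesystem hands back cp1252 octets, which is only a guess for a Swedish Windows setup; the sample filename and the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the filesystem stores names as cp1252 octets.
my $fs_encoding = 'cp1252';

# Raw octets as they might come back from readdir() or a file dialog.
# These are the cp1252 bytes for "räksmörgås.txt" (hypothetical name).
my $raw_name = "r\xE4ksm\xF6rg\xE5s.txt";

# Input: octets -> Perl character string.
my $name = decode($fs_encoding, $raw_name);

# Inside the program, work with characters.
my $chars = length $name;

# Output: characters -> octets, before handing the name to open(), -e, etc.
my $octets = encode($fs_encoding, $name);

print "characters: $chars\n";
print "round trip ok\n" if $octets eq $raw_name;
```

    If the round trip does not give back the original octets, the assumed encoding is probably not the one actually in use.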

    OTOH, tk___getOpenFile() might be doing its own thing... (I've never used tkx, so I can't tell). In other words, you'd have to check whether the names for the selected file(s) are still in the same (raw) encoding being used in the filesystem.  To check such questions, it's always a good idea to use hexdumps or, better yet, Devel::Peek (which also informs about the state of Perl's utf8 flag) — simply printing the strings to your terminal, or viewing them in your favorite editor, browser, or some such, might involve implicit encoding conversions, font issues, etc. that all in all will typically only serve to confuse matters even more... (unless you know your tools very well).
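
    A minimal sketch of such a check, assuming the path is already in a variable. The hexdump helper and the sample value are hypothetical; Dump comes from Devel::Peek and writes to STDERR.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

# Render every character of a string as a hex code point.
sub hexdump {
    my ($str) = @_;
    return join ' ', map { sprintf '%02x', ord } split //, $str;
}

# Hypothetical value: Latin-1/cp1252 octets for "café.txt".
# In the real script this would be whatever the file dialog returned.
my $path = "caf\xE9.txt";

print hexdump($path), "\n";
print 'utf8 flag: ', (utf8::is_utf8($path) ? 'on' : 'off'), "\n";

# Shows Perl's internal representation, including the FLAGS line.
Dump($path);
```

    A single e9 for é suggests a Latin-1-style single-byte encoding, while c3 a9 would point to UTF-8 octets.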

Re: Handling file path with unusual characters
by ELISHEVA (Prior) on Feb 24, 2009 at 13:25 UTC
      Hello,

      Thanks for the reply. I've read the article and tested the script that can be used to determine which encoding is used in the environment I'm running in. However, none of the encodings produced a correct result. What should I do then? If I've understood this correctly, the encode command could solve my issue, but I need to know which encoding to use.

      Thanks!
Re: Handling file path with unusual characters
by poolpi (Hermit) on Feb 24, 2009 at 10:05 UTC

    I think you should encode Perl's internal string form into your encoding with encode from Encode:

    use Encode qw(encode);
    my $octets = encode($encoding, $path);
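
    When the right value for $encoding is unknown, one way to narrow it down is to encode a known string into a few candidates and compare the octets with a hexdump of what the file dialog actually returned. The sketch below is only an illustration: the candidate list is a guess at encodings plausible on a Swedish Windows system, and octets_hex is a hypothetical helper.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Encode a character string and show the resulting octets as hex.
sub octets_hex {
    my ($encoding, $chars) = @_;
    my $octets = encode($encoding, $chars);
    return join ' ', map { sprintf '%02x', ord } split //, $octets;
}

# The characters å, ä, ö given as code points, so the example
# does not depend on the encoding of the source file itself.
my $sample = "\x{E5}\x{E4}\x{F6}";

# Candidate encodings are an assumption, not an exhaustive list.
for my $encoding (qw(cp1252 cp850 UTF-8 UTF-16LE)) {
    printf "%-8s %s\n", $encoding, octets_hex($encoding, $sample);
}
```

    The candidate whose octets match the ones seen in the real path is the one to pass to encode (or decode).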


    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
Re: Handling file path with unusual characters
by DrHyde (Prior) on Feb 25, 2009 at 11:09 UTC

    For anything other than ASCII to work, you need to make sure that several things support it, and to configure them. First, your code and the language it's written in need to support your chosen character set. This also includes any third-party libraries that you use. Second, the OS needs to support that character set. Third, your display device or terminal emulator needs to support it. Fourth, your filesystem needs to support it.

    There is no consistent comprehensive documentation for doing this - I've looked several times for both Linux and OS X, and it simply doesn't exist.

    Good luck.
