in reply to Writing UTF8 Filename

I am trying to write a utf8 filename with mv...

Why? Let me suggest that unless you have a really, seriously, inescapably unavoidable and compulsory reason for doing this, you would be much better off not doing it. Just don't.

The current state of support for non-ASCII characters in file names is not what I would call "stable" (or "sane" or "worth the hassle"). It is likely to vary in significant and perplexing ways across various (versions of) operating systems and file/text transfer protocols. Even on a single system where non-ASCII file names seem to "work", you are likely to discover a crippling amount of "variability" among various applications currently running on that system in terms of how (or whether) they deal with non-ASCII characters in file names.

If you are just trying to spruce up the appearance of your music collection, use a database or XML structure that relates sensible (ASCII-only) file names to whatever sort of strings you want to see as the list of files.
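
A minimal sketch of that idea, assuming a small in-memory catalogue. The hash name `%title_for`, the file names, and the titles are all hypothetical; a real collection would presumably load the mapping from a database or an XML file instead.

```perl
use strict;
use warnings;
use utf8;                               # this source contains UTF-8 literals

# Hypothetical catalogue: ASCII-only names on disk, arbitrary display titles.
my %title_for = (
    'track_0001.mp3' => 'Für Elise',
    'track_0002.mp3' => 'Café del Mar',
);

# Encode explicitly on output so the titles reach the terminal as UTF-8.
binmode STDOUT, ':encoding(UTF-8)';

# Show the friendly title next to the real (ASCII) file name.
for my $file (sort keys %title_for) {
    print "$title_for{$file}  ($file)\n";
}
```

The files themselves never carry a non-ASCII name, so every tool that touches them sees plain ASCII; only the display layer has to know about Unicode.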

If you actually do believe there is an unavoidable, compulsory need for this, try to think of a work-around that involves using ASCII-only strings. If you can't... well, perhaps other replies here will help, but the solution may be OS dependent, and you might regret it later. Good luck.

(BTW, you might find it easier to use the perl built-in function "rename" -- it saves you from worrying about what happens to non-ASCII data being passed as command-line args to a sub-shell.)
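
A rough sketch of the `rename` route, under the assumption that the filesystem stores names as UTF-8 bytes (typical on current Linux systems, not guaranteed elsewhere). The file names and the scratch directory are hypothetical; the new name is encoded to bytes before it is handed to `rename`.

```perl
use strict;
use warnings;
use utf8;                               # this source contains UTF-8 literals
use Encode qw(encode);
use File::Temp qw(tempdir);

# Scratch directory so the demo does not touch real files.
my $dir = tempdir(CLEANUP => 1);

# Create an empty file with a plain ASCII name.
my $old = "$dir/old_name.txt";
open my $fh, '>', $old or die "Cannot create $old: $!";
close $fh;

# Hypothetical new name, held as a Perl character string.
my $new_name = "Ünïcödé.txt";

# Assumption: the filesystem expects UTF-8 bytes in file names.
my $new_bytes = encode('UTF-8', "$dir/$new_name");

# No shell is involved, so there is no command-line quoting to worry about.
rename $old, $new_bytes or die "rename failed: $!";
print "renamed\n" if -e $new_bytes;
```

Whether other programs on the same system then display or open that name correctly is a separate question, and is exactly the variability described above.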

Re^2: Writing UTF8 Filename
by Juerd (Abbot) on Nov 17, 2007 at 09:47 UTC

    (BTW, you might find it easier to use the perl built-in function "rename" -- it saves you from worrying about what happens to non-ASCII data being passed as command-line args to a sub-shell.)

    Yeah, instead of worrying about what happens if you do a system call, you now have to worry about what happens if you do a system call. The same thing happens: Perl hands the string to the OS as it is stored internally, so the name may go out as latin1 or as utf8, depending on the circumstances. Thus: encode explicitly.

    Juerd # { site => 'juerd.nl', do_not_use => 'spamtrap', perl6_server => 'feather' }

      Thank you all for your responses.

      Juerd, I am wondering what you mean when you keep saying "encode explicitly":

      The Unicode strings I am trying to set as the filename arrive from a web form encoded as utf8. They go into a MySQL database that uses the utf8 character set. They display properly on the web form. What step for encoding explicitly could I be missing? It seems to me that they begin life as utf8 and they stay that way. How do I get more explicit?

        Juerd, I am wondering what you mean when you keep saying "encode explicitly":

        Perhaps it is time to read the Perl Unicode Tutorial :-).

        If they began life as utf8, they do indeed very probably stay that way if you do nothing to them, but if that were the case, I think you might not have been asking the question that you have.

        Encoding explicitly means using encode() or encode_utf8(), in this case. It also requires that the values coming from the database are decoded at some point. The DBD::mysql module can do this for you. I don't know whether it is doing so in your setup.
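
        A minimal sketch of what "decode on the way in, encode on the way out" could look like. It assumes the driver returned raw UTF-8 bytes; the variable names and the sample value are hypothetical. (DBD::mysql has a `mysql_enable_utf8` connection attribute that can do the decoding step; whether it is set in a given setup is something to check.)

        ```perl
        use strict;
        use warnings;
        use Encode qw(decode encode);

        # Assumption: the database handed back undecoded UTF-8 bytes.
        my $bytes_from_db = "caf\xC3\xA9";      # "café" as five UTF-8 octets

        # Decode at the input boundary: bytes become a Perl character string.
        my $chars = decode('UTF-8', $bytes_from_db);

        # Work with characters inside the program (here: append an extension).
        my $name = $chars . '.txt';

        # Encode at the output boundary: characters become bytes again,
        # in an encoding chosen by you rather than by Perl's internals.
        my $filename_bytes = encode('UTF-8', $name);

        printf "%d characters, %d bytes\n", length($name), length($filename_bytes);
        ```

        The point is that `$filename_bytes`, not `$name`, is what gets passed to rename, open, or a system call.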

Re^2: Writing UTF8 Filename
by eserte (Deacon) on Nov 17, 2007 at 16:34 UTC
    The current state of support for non-ASCII characters in file names is not what I would call "stable" (or "sane" or "worth the hassle").
    Call it what it is: non-existent.

    Maybe we'll have encoded filenames support in perl 5.12?

      Maybe we'll have encoded filenames support in perl 5.12?

      My "current state of support" comment was a reference to OS-level issues (on whatever OS). I would not expect such support from perl any time soon, given that there is no consistent form of OS support.

        Well, Win32 has had stable support for Unicode filenames for many years. Perl's support for that is woeful but I'm working on that. I'd make a joke about "Real" operating systems, but it seems the Unix mongers may need a break from that in order to recover their sense of humor. (:

        - tye