Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a form that uploads files. Some of these files have names containing Chinese characters. I need to save the files on the server and must preserve their names. I am using CGI.pm.

How can I do this? Thanks.

Re: Saving file name with Chinese characters
by ikegami (Patriarch) on Apr 13, 2009 at 18:09 UTC

    Unix stores file names as byte strings, and programs conventionally interpret those bytes using the current locale's encoding, often UTF-8. I don't know offhand how to get the correct encoding, but you could look at how the open pragma does it. Once you know the encoding, you can use Encode's encode to encode the file name.
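As a rough sketch of the Unix case: the core module I18N::Langinfo can report the locale's codeset, which can then be handed to Encode. The helper names `locale_encoding_name` and `encode_file_name` are made up for this example, and the fallback to UTF-8 is an assumption, not something the locale guarantees.

```perl
use strict;
use warnings;
use Encode qw( encode find_encoding );
use I18N::Langinfo qw( langinfo CODESET );

# Hypothetical helper: name of the locale's encoding.
# Falls back to UTF-8 (an assumption) if the codeset is
# unavailable or unknown to Encode.
sub locale_encoding_name {
    my $codeset = eval { langinfo(CODESET) };
    return 'UTF-8' if !defined($codeset) || !length($codeset);
    return find_encoding($codeset) ? $codeset : 'UTF-8';
}

# Hypothetical helper: turn a decoded (character) file name into
# the bytes to pass to open. Dies if the name cannot be
# represented in the given encoding.
sub encode_file_name {
    my ($name, $encoding) = @_;
    return encode( $encoding, $name, Encode::FB_CROAK | Encode::LEAVE_SRC );
}

my $encoding = locale_encoding_name();
my $bytes    = eval { encode_file_name( "\x{4EBA}.txt", $encoding ) };

if ( defined $bytes ) {
    printf "%s: %d bytes\n", $encoding, length($bytes);
    # open( my $fh, '>', $bytes ) or die "open: $!\n";
}
else {
    print "The name cannot be represented in $encoding\n";
}
```

The name coming from the upload form has to be decoded into characters first; this sketch starts from an already decoded string.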

    Windows uses UCS-2le internally. As such, it supports all Unicode characters in the Basic Multilingual Plane (plane 0, those up to U+FFFF). However, Perl's built-in file functions, such as open, don't use the system call capable of handling those characters. You'll need to create or open the file using Win32API::File's CreateFileW. CreateFileW expects you to encode the file name yourself (using UCS-2le).

    Update: I wasn't sure if Windows used UCS-2le or UTF-16le, so I put it to the test. It won't let me create a file with U+10000 in its name, ruling out UTF-16le. I adjusted the above accordingly.
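For reference, here is a small sketch of where the two encodings differ, using only Encode. For a character inside the BMP, UCS-2le and UTF-16le produce the same bytes; U+10000 needs a surrogate pair, which only UTF-16le can express. The variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw( encode );

my $bmp_char = "\x{4EBA}";     # Inside the BMP
my $astral   = "\x{10000}";    # First code point outside the BMP

# Inside the BMP, both encodings give the same two bytes.
my $bmp_ucs2  = encode( 'UCS-2LE',  $bmp_char );
my $bmp_utf16 = encode( 'UTF-16LE', $bmp_char );

# Outside the BMP, UTF-16le uses a surrogate pair (four bytes).
my $astral_utf16 = encode( 'UTF-16LE', $astral );

# UCS-2le has no surrogates, so a strict encode is expected to fail here.
my $astral_ucs2 = eval {
    encode( 'UCS-2LE', $astral, Encode::FB_CROAK | Encode::LEAVE_SRC );
};

printf "BMP char:  UCS-2le %s, UTF-16le %s\n",
    unpack( 'H*', $bmp_ucs2 ), unpack( 'H*', $bmp_utf16 );
printf "U+10000:   UTF-16le %s\n", unpack( 'H*', $astral_utf16 );
printf "U+10000:   UCS-2le %s\n",
    defined $astral_ucs2 ? unpack( 'H*', $astral_ucs2 ) : 'not encodable';
```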

      Can you please show an example of using CreateFileW? I looked at the docs, but I am not too sure. Thanks.
        Look under CreateFile (note the case) for a detailed description of each arg.

        That leaves getting a Perl handle from the Win32API::File object:

        use strict;
        use warnings;

        use Encode         qw( encode );
        use Symbol         qw( gensym );
        use Win32API::File qw( CreateFileW OsFHandleOpen CREATE_ALWAYS GENERIC_WRITE );

        my $qfn = chr(0x2660);  # Whatever

        my $win32f = CreateFileW(
            encode('UCS-2le', $qfn),
            GENERIC_WRITE,  # For writing
            0,              # Not shared
            [],             # Security attributes
            CREATE_ALWAYS,  # Create and replace
            0,              # Special flags
            [],             # Permission template
        )
            or die("CreateFile: $^E\n");

        OsFHandleOpen( my $fh = gensym(), $win32f, 'w' )
            or die("OsFHandleOpen: $^E\n");

        print $fh "Foo!\n";
      "However, Perl doesn't use the system call capable of handling those characters."

      I tried it with open, and the file was created successfully. I guess the server doesn't have the East Asian language pack installed.

        Not quite. A file is created, but it doesn't have the right name. For example, if you use CreateFileW, you can actually create a file whose name is the single character "人" ("person"). If you try to create such a file with open, it'll create the three-character file name "äºº" instead.
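One way to see where three characters can come from, as a sketch under two assumptions: that the UTF-8 bytes of the name reach a byte-oriented system call, and that the system's ANSI code page is Windows-1252 (other code pages would give different characters).

```perl
use strict;
use warnings;
use Encode qw( encode decode );

my $name = "\x{4EBA}";    # The single character "person"

# The UTF-8 form of this one character is three bytes: E4 BA BA.
my $utf8_bytes = encode( 'UTF-8', $name );

# Assumption: the system reads each byte as one cp1252 character.
my $misread = decode( 'cp1252', $utf8_bytes );

printf "%d character(s) in, %d byte(s), %d character(s) out\n",
    length($name), length($utf8_bytes), length($misread);
printf "Code points seen by the system: %s\n",
    join ' ', map { sprintf 'U+%04X', ord } split //, $misread;
```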