Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Heya. The example below creates a file named 东西.txt, which is wrong. Operating system is Win2K, file system is NTFS. Generally I have no problems with filenames in unicode. What to do?

Using Super Search I found "Win32 API directory searches that return wide / unicode filenames", but I don't understand it well enough to adapt it to creating files.

perl -v

This is perl, v5.8.7 built for MSWin32-x86-multi-thread
(with 14 registered patches, see perl -V for more detail)

Binary build 815 [211909] provided by ActiveState
Built Nov 2 2005 08:44:52

#!perl
use utf8;
use strict;
use diagnostics;

my $name = '&#19996;&#35199;.txt';
{
    open my $fh, '>', $name
        or die "could not open file <$name> for writing: $!";
    binmode $fh, ':utf8';
    print $fh "&#20320;&#22909;&#19990;&#30028;\n";
    close $fh;
};

PS: I did not use any HTML entities in my submission. The e2 software is just being "helpful". You figure out what I really wrote.

Re: Unicode in filenames
by blahblahblah (Priest) on Feb 13, 2006 at 02:42 UTC
    Maybe you could encode it using a local encoding, something like this?:
    use Encode;
    $decodedLocalName = decode('utf8', $name);
    $decodedLocalName = encode('iso-8859-1', $decodedLocalName);
    open my $fh, '>', $decodedLocalName ...
    You said you generally have no problem with unicode filenames. Do you mean that on this same OS, similar unicode characters work in similar perl scripts, or is there something special about this case?
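To make the local-encoding idea above concrete, here is a minimal sketch of what Encode does with this particular name. It assumes the name is already a Perl character string (as it would be under "use utf8"), and it uses cp936 (GBK) purely as an assumed example of a code page that covers these characters; whether that matches the ANSI code page of the machine in question is not known from the thread. ISO-8859-1 has no mapping for the two characters, so Encode replaces each with its substitution character.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The intended filename as Perl characters: U+4E1C U+897F followed by ".txt".
my $name = "\x{4e1c}\x{897f}.txt";

# ISO-8859-1 cannot represent these two characters, so with the default
# CHECK value Encode substitutes "?" for each of them.
my $latin1 = encode('iso-8859-1', $name);

# cp936 (GBK) is an assumed example of a code page that does cover them.
my $gbk = encode('cp936', $name);

# Show both results; the cp936 bytes are printed as hex for inspection.
printf "iso-8859-1: %s\n", $latin1;
printf "cp936 bytes: %s\n", join ' ', map { sprintf '%02X', ord } split //, $gbk;
```

If the system's ANSI code page does not cover the characters in the name, no byte encoding will help with the ordinary open(), and the wide-character API route is the remaining option.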

Re: Unicode in filenames
by BrowserUk (Patriarch) on Feb 13, 2006 at 03:40 UTC

    There used to be a command line switch, -C, which enabled the use of the wide-character Win32 APIs within perl. For some inexplicable reason this was dropped and the switch recycled at some point.

    You can get access to most of the W-suffixed APIs, including CreateFileW(), via tye's Win32API::File. You'll also need OsFHandleOpen() to 'convert' the native file handle returned by CreateFile() into a Perl filehandle before you can use the normal Perl file I/O constructs.
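The approach described above might look roughly like the following sketch. It is untested on the poster's setup and rests on several assumptions: that Win32API::File's CreateFileW() expects the path as a NUL-terminated UTF-16LE string, that [] is accepted for the NULL pointer arguments, and that OsFHandleOpen() accepts a glob reference from Symbol::gensym. The helper names wide_path and open_wide_for_write are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Symbol qw(gensym);

# Hypothetical helper: encode a name as UTF-16LE with a terminating NUL,
# which is the form the W-suffixed APIs are assumed to expect here.
sub wide_path {
    my ($name) = @_;
    return encode('UTF-16LE', $name . "\0");
}

# Hypothetical helper: create $name via CreateFileW() and hand back a
# normal Perl filehandle obtained through OsFHandleOpen().
sub open_wide_for_write {
    my ($name) = @_;
    require Win32API::File;              # only installed on Windows
    Win32API::File->import(':ALL');

    my $native = CreateFileW(
        wide_path($name),
        GENERIC_WRITE(),                 # desired access
        0,                               # no sharing
        [],                              # default security attributes
        CREATE_ALWAYS(),                 # create, or truncate if present
        0,                               # no special flags
        [],                              # no template handle
    ) or die "CreateFileW failed: $^E";

    my $fh = gensym();
    OsFHandleOpen($fh, $native, 'w')
        or die "OsFHandleOpen failed: $^E";
    return $fh;
}

# Only attempt the Win32 calls on Windows.
if ($^O eq 'MSWin32') {
    my $fh = open_wide_for_write("\x{4e1c}\x{897f}.txt");
    binmode $fh, ':utf8';
    print {$fh} "\x{4f60}\x{597d}\x{4e16}\x{754c}\n";
    close $fh or die "close failed: $!";
}
```

Once OsFHandleOpen() succeeds, the Perl filehandle owns the native handle, so closing the Perl filehandle should be enough; a separate CloseHandle() call is not made here.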


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      "For some inexplicable reason this was dropped and the switch recycled at some point."
      From perl581delta:
      (Win32) The -C Switch Has Been Repurposed
      The -C switch has changed in an incompatible way. The old semantics of this switch only made sense in Win32 and only in the "use utf8" universe in 5.6.x releases, and do not make sense for the Unicode implementation in 5.8.0. Since this switch could not have been used by anyone, it has been repurposed. The behavior that this switch enabled in 5.6.x releases may be supported in a transparent, data-dependent fashion in a future release.