in reply to Opening files with japanese/chinese chars in filename

The Perl 5.10 todo wish list states that functions like chdir, opendir, readdir, readlink, rename, rmdir, etc.
"could potentially accept Unicode filenames either as input or output".
Windows uses UTF-16LE natively, but the console 'dir' command will only return ANSI names. Unicode characters are therefore replaced with "?",
even if you invoke the console with the Unicode switch (cmd.exe /u), change the code page to 65001 (which is UTF-8 on Windows),
and use the Lucida Console TrueType font, which supports Unicode.
A workaround is to use the COM facilities provided by Windows (in this case Scripting.FileSystemObject), which provide a much higher level of abstraction,
or to use the API as pointed out in this thread.
Prompted by your question, I tried to read a file with Japanese characters in its filename, located in the current folder, and then move it to another folder.
The filename is "は群馬県高崎市を拠点に、様々なメディ.txt"
Just create a new file and copy/paste this as the filename. (I don't know what it means; I just googled for 'japanese' and this turned up, so don't flame me if it means something bad!)
You also need to have the appropriate fonts installed. Since opendir, readdir, rename, etc. do not support Unicode, you have to resort to the Scripting.FileSystemObject methods and properties, which accept Unicode.
This is the actual code:

    use strict;
    use warnings;
    use Win32::OLE qw(in);
    use Devel::Peek;

    # CP_UTF8 is very important: it translates between Perl strings and
    # the Unicode strings used by the OLE interface.
    Win32::OLE->Option(CP => Win32::OLE::CP_UTF8());

    my $obj = Win32::OLE->new('Scripting.FileSystemObject')
        or die "Cannot create Scripting.FileSystemObject";
    my $folder     = $obj->GetFolder(".");
    my $collection = $folder->{Files};

    mkdir("c:\\newfolder") or die "mkdir failed: $!";

    foreach my $value (in $collection) {
        my $filename = $value->{Name};
        next if $filename !~ /\.txt$/i;    # only files ending in .txt
        Dump($filename);                   # check if the utf8 flag is on
        my $file = $obj->GetFile($filename);
        $file->Move("c:\\newfolder\\$filename");
        print(Win32::OLE->LastError() || "success\n\n");
    }
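
The Dump call above is only there to inspect the UTF-8 flag on the name returned by OLE. As a rough, portable illustration of what that flag distinguishes (this sketch does not touch Win32::OLE at all, and the variable names are just for illustration), the same filename can be held either as a character string or as raw UTF-8 octets, and the two differ in length:

```perl
use strict;
use warnings;
use utf8;                         # this source contains a literal Japanese string
use Encode qw(encode decode);

# Character string: one element per character, UTF-8 flag on.
my $name = "は群馬県高崎市を拠点に、様々なメディ.txt";

# Octet string: the same name as raw UTF-8 bytes, UTF-8 flag off.
my $bytes = encode('UTF-8', $name);

printf "characters: %d, flag %s\n", length($name),  utf8::is_utf8($name)  ? 'on' : 'off';
printf "octets:     %d, flag %s\n", length($bytes), utf8::is_utf8($bytes) ? 'on' : 'off';

# Decoding the octets gives back the original character string.
print "round trip ok\n" if decode('UTF-8', $bytes) eq $name;
```

utf8::is_utf8 is a lighter-weight check than Devel::Peek when all you want to know is whether the flag is set.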

What puzzles me is that you say you don't see the correct filename in Explorer, when you should.
This will only work if you have support for Asian languages enabled (regional settings), in which case you should be able to see the Japanese name in Explorer as above.

Re^2: Opening files with japanese/chinese chars in filename
by Anonymous Monk on Apr 22, 2008 at 08:12 UTC
    Thanks for your tip; I was searching online for the same question and found this page.
    With the above code I got $file->Move to work with CJK filenames. The problem I have is that if $path is a path containing UTF-8, $folder = $obj->GetFolder($path);
    does not seem to work, whereas if it is in Big5 it works...

    Any suggestion? Thanks!

      Can you be more specific? Maybe provide a code sample and point out where the actual problem is?

      Try the following script, which just gets all subdirectories and prints their names.
      Does it work for the directory in question?

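          use strict;
          use warnings;
          use Win32::OLE qw(in);

          Win32::OLE->Option(CP => Win32::OLE::CP_UTF8());
          binmode(STDOUT, ':encoding(UTF-8)');    # avoid "Wide character in print" warnings

          my $obj = Win32::OLE->new('Scripting.FileSystemObject')
              or die "Cannot create Scripting.FileSystemObject";
          my $folder     = $obj->GetFolder(".");
          my $collection = $folder->{SubFolders};

          foreach my $value (in $collection) {
              my $foldername = $value->{Name};
              my $subfolder  = $obj->GetFolder($foldername);
              # LastError is false on success, so print the name in that case
              my $err = Win32::OLE->LastError();
              print $err ? "$err\n" : "$foldername\n";
          }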
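
Since the question mentions paths in both UTF-8 and Big5, one thing that may be worth ruling out is a mismatch between the encoding of $path and the code page given to Win32::OLE. The sketch below is a guess at a possible cause, not a confirmed diagnosis. It assumes the path arrives as raw bytes in a known encoding and turns it into a Perl character string first. path_to_chars is a hypothetical helper, not part of Win32::OLE, and the sample path is made up.

```perl
use strict;
use warnings;
use utf8;                          # the literals below are UTF-8 in the source
use Encode qw(decode encode);

# Hypothetical helper (not part of Win32::OLE): return $path as a Perl
# character string, decoding it from $encoding if it is still raw bytes.
sub path_to_chars {
    my ($path, $encoding) = @_;
    return $path if utf8::is_utf8($path);                 # already decoded
    return decode($encoding, $path, Encode::FB_CROAK);    # die on malformed input
}

# Made-up sample path, produced here in both encodings for the demonstration.
my $wanted    = "c:\\中文";
my $from_utf8 = path_to_chars(encode('UTF-8',     $wanted), 'UTF-8');
my $from_big5 = path_to_chars(encode('big5-eten', $wanted), 'big5-eten');

print "UTF-8 input decodes to the same path\n" if $from_utf8 eq $wanted;
print "Big5 input decodes to the same path\n"  if $from_big5 eq $wanted;
```

The decoded string could then be passed to $obj->GetFolder(...) with the CP option set to CP_UTF8 as in the scripts above. Whether that resolves the reported problem is untested here.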