Re: Wide characters in Windows filenames with File::Copy

by ikegami (Patriarch)
on Nov 20, 2023 at 19:31 UTC


in reply to Wide characters in Windows filenames with File::Copy

Use Win32::LongPath's copyL instead.


The file names passed to copy need to be strings of bytes. On Unix, you'd encode the text of the name using the encoding of the system's locale. On Windows, you'd encode it using the encoding returned by "cp".Win32::GetACP().
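A minimal sketch of that encoding step, assuming the file name starts out as decoded text. The helper names fs_encoding and encode_fname are made up for illustration, and the UTF-8 fallback for non-Windows systems is an assumption (a real program would consult the locale).

```perl
use strict;
use warnings;
use Encode qw( encode );

# Name of the encoding expected by byte-oriented file functions.
# (Hypothetical helper, not part of any module.)
sub fs_encoding {
    if ($^O eq 'MSWin32') {
        require Win32;
        return "cp" . Win32::GetACP();   # e.g. "cp1252" on many English systems
    }
    return 'UTF-8';   # Assumption: UTF-8 locale on non-Windows systems
}

# Encode a decoded file name into bytes, dying if a character
# can't be represented in the target encoding.
sub encode_fname {
    my ($name, $enc) = @_;
    $enc //= fs_encoding();
    return encode($enc, $name, Encode::FB_CROAK() | Encode::LEAVE_SRC());
}
```

You'd then call something like copy(encode_fname($src), encode_fname($dst)) with File::Copy. Dying on unencodable characters is a deliberate choice here: Encode's default is to substitute a replacement character, which would silently produce the wrong file name.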

The reason for this is that File::Copy's copy uses Win32's Win32::CopyFile, which exposes the CopyFileA system call. The "(A)NSI" system calls use the system's Active Code Page. The exception to this is when the program's manifest sets the program's Active Code Page to 65001, UTF-8. I keep meaning to try this to change perl's Active Code Page to UTF-8. You'd still have to encode the file name, but with UTF-8 (or cp65001, the alias the earlier snippet would return after this change).

On an English machine, the encoding is probably cp1252. Fortunately, the file name in question can be encoded using Windows-1252. If you wanted to support file names that can't be encoded using your system's ACP, you'd have to change the program's ACP as mentioned above, or you'd have to use CopyFileW, the "(W)ide" or "Unicode" version of the system call, which takes UTF-16le strings. Win32::LongPath's copyL exposes this call. (It also munges the paths to allow longer paths, but this is transparent.) If it weren't already exposed, you could have used a module like FFI::Platypus or Win32::API to access it, or you could have written your own XS module.
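Here's a rough sketch of what using copyL could look like. The wrapper name copy_text_named is hypothetical, and the non-Windows branch (plain File::Copy with UTF-8 encoded names) is only there so the script runs elsewhere; it assumes a file system with UTF-8 file names. I'm going from memory of Win32::LongPath's documentation that copyL takes the source and destination as text and returns true on success, so check the docs before relying on it.

```perl
use strict;
use warnings;
use Encode qw( encode );
use File::Copy qw( copy );

# Copy a file whose source and destination names are decoded text.
# (Hypothetical wrapper for illustration.)
sub copy_text_named {
    my ($src, $dst) = @_;
    if ($^O eq 'MSWin32') {
        # Loaded at run time so the script still compiles on other systems.
        require Win32::LongPath;
        # copyL goes through the wide (CopyFileW) call, so the names
        # aren't limited to what the Active Code Page can represent.
        return Win32::LongPath::copyL($src, $dst);
    }
    # Assumption: non-Windows file system with UTF-8 file names.
    return copy(encode('UTF-8', $src), encode('UTF-8', $dst));
}
```

Usage would be along the lines of copy_text_named($src, $dst) or die "Copy failed: $!"; with both names as ordinary decoded strings, no manual encoding needed on the Windows side.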

Replies are listed 'Best First'.
Re^2: Wide characters in Windows filenames with File::Copy
by Jenda (Abbot) on Nov 21, 2023 at 14:47 UTC

    This is one of the things that should have been changed and long forgotten about two decades ago. There is no sane reason for File::Copy to use the ACP and there hasn't been for ages.

    Jenda
    1984 was supposed to be a warning,
    not a manual!

Re^2: Wide characters in Windows filenames with File::Copy
by slugger415 (Monk) on Nov 21, 2023 at 04:25 UTC

    thank you!!