in reply to Rename Windows files with Unicode chars

For Unicode filenames on Windows you need to use Win32::Unicode.

update: Example

#!/usr/bin/perl --
use strict;
use warnings;
use Win32::Unicode -native;

listDir();

open my($fh), '>:encoding(UTF-8)', qq{I-\x{2665}-Perl} or die $!;
print $fh qq{I-\x{2665}-Perl};
close $fh;
listDir();

rename qq{I-\x{2665}-Perl}, 'I-love-Perl';
listDir();

unlink 'I-love-Perl';

sub listDir {
    my( $dir ) = grep defined, @_, '.';
    my $wdir = Win32::Unicode::Dir->new( );
    $wdir->open($dir) or die $!;
    for ($wdir->fetch) {
        next if /^\.{1,2}$/;
        my $full_path = "$dir/$_";
        if (file_type('f', $full_path)) {
            print "f $_\n";
        }
        elsif (file_type('d', $full_path)){
            print "d $_\n";
        }
    }
    $wdir->close or die $!;
    print "\n####\n\n";
}
__END__
$ chcp 65001
Active code page: 65001

$ perl win32-unicode-native-to-ascii.pl
f win32-unicode-native-to-ascii.pl

####

f I-♥-Perl
f win32-unicode-native-to-ascii.pl

####

f I-love-Perl
f win32-unicode-native-to-ascii.pl

####


$

Re^2: Rename Windows files with Unicode chars
by mnooning (Beadle) on Aug 27, 2016 at 03:35 UTC

    The line

    rename qq{I-\x{2665}-Perl}, 'I-love-Perl';

    is cheating. You were able to type in the name of the file that was to be renamed because you already knew what the original file name was. The problem is that I will never have such knowledge beforehand.

    Can you show code that reads a file name from the given directory, complete with its non-ASCII characters, saves the original file name in a variable, strips the non-ASCII characters from that name, and then renames the file (using the saved original name) to the new all-ASCII name?

      Hi,

      Did you forget about  sub listDir { ? How does listDir cheat at reading unicode filenames?

      :)

        I needed to have the file name as read by listDir to be what is used during the renaming. To test if the concept worked I stripped the directory of all but your script. I then created two globals, $global_file_name and $new_name. Your code created I-unicode heart-Perl. When listDir read in the file name I saved it in $global_file_name. Just prior to the rename I saved $global_file_name into $new_name. Just before the rename line I did


        $new_name =~ s![^[:ascii:]]!!g;

        I then changed the rename line to

        rename $global_file_name, $new_name;

        It worked. I now know I can use this approach to strip the non-ASCII characters from a large number of file names. Thank you very much for your help.
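The name-mapping step described above can be sketched on its own, without touching the file system. This is only a minimal sketch under stated assumptions: ascii_name and rename_plan are hypothetical helper names that do not appear in the thread, the file names are hard-coded instead of coming from Win32::Unicode::Dir->fetch, and the collision handling (skip a name whose stripped form is empty or already taken) is one possible policy, not something the thread specifies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop every character outside the ASCII range.
# The POSIX class [:ascii:] is only recognised inside a bracketed class,
# hence the doubled brackets in [^[:ascii:]].
sub ascii_name {
    my ($name) = @_;
    (my $ascii = $name) =~ s/[^[:ascii:]]//g;
    return $ascii;
}

# Hypothetical helper: map each original name to its ASCII-only target.
# Names that are already ASCII, that would become empty, or whose target
# is already in use are left out of the plan.
sub rename_plan {
    my @names = @_;
    my %taken = map { $_ => 1 } @names;
    my %plan;
    for my $original (@names) {
        my $target = ascii_name($original);
        next if $target eq $original;    # nothing to strip
        next if $target eq '';           # nothing left after stripping
        next if $taken{$target}++;       # target exists or is already planned
        $plan{$original} = $target;
    }
    return %plan;
}

# Assumption: in the thread's setting these names would be read with
# Win32::Unicode::Dir->fetch; they are hard-coded here so the sketch
# runs on any platform.
my @names = ( "I-\x{2665}-Perl", 'plain.txt', "caf\x{e9}.txt" );
my %plan  = rename_plan(@names);

for my $original (sort keys %plan) {
    # With "use Win32::Unicode -native;" in effect, the actual
    # rename $original, $plan{$original} would go here.
    print "would rename to: $plan{$original}\n";
}
```

Keeping the saved original name as the hash key and the stripped name as the value mirrors the two variables used above, and building the whole plan before renaming anything makes it easy to review the result first.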

Re^2: Rename Windows files with Unicode chars
by Anonymous Monk on Sep 01, 2016 at 17:08 UTC
    Another option is the COM interface; more here:
    Unicode issues in Perl (from a windows perspective)
    www.i-programmer.info/programming/other-languages/1973-unicode-issues-in-perl.html
      Thanks for the mention. The article goes to some length describing the underlying encoding issues and how to deal with them, but for the OP's purpose the following snippet, extracted from the article, should do it:
      use Win32::Console;
      Win32::Console::OutputCP( 65001 );
      use Devel::Peek;
      use Win32::OLE qw(in);

      binmode(STDOUT, ":utf8");
      Win32::OLE->Option(CP => Win32::OLE::CP_UTF8);

      $obj = Win32::OLE->new('Scripting.FileSystemObject');
      $folder = $obj->GetFolder(".");
      $collection = $folder->{Files};

      foreach $value (in $collection) {
          $filename = $value->{Name};          # file name returned by COM as UTF-8
          next if ($filename !~ /\.rar$/i);    # only look at .rar files
          print $filename, "\n";
          Dump $filename;                      # show the internal representation
      }
      I haven't benchmarked it, but logically the Win32 API calls should be faster than calling into COM. Nevertheless, COM exposes the FileSystemObject methods, which might be convenient anyway.
Re^2: Rename Windows files with Unicode chars
by dpoppi (Initiate) on Jan 18, 2017 at 07:22 UTC
    Win32::Unicode does not compile under 5.24 (mswin32), so I cannot use that code.

      Hi,

      Win32::Unicode does not compile under 5.24 (mswin32),

      Are you sure about that?

      Looking at CPAN Testers Reports: Report for Win32-Unicode-0.38, all I see is a failing test -- meaning the module did compile and can be installed.

      so I cannot use that code.

      Maybe you can use an older version of Perl?
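One quick way to settle whether a module is usable on a given perl is to try loading it. The following is a minimal sketch, not something posted in the thread: module_loads is a hypothetical helper name, and it only reports whether require succeeds, which says nothing about whether the module's test suite passes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if this perl can load the named module,
# 0 otherwise. The module name is turned into a file path because
# require with a string argument expects Foo/Bar.pm, not Foo::Bar.
sub module_loads {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('Win32::Unicode') {
    printf "perl v%vd: %s %s\n", $^V, $module,
        module_loads($module) ? 'loads' : 'does not load';
}
```

Run with each installed perl in turn, this shows directly which versions can load the module.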