I get files on Windows 7 whose names contain wide characters. I need to rename them after stripping the non-ASCII characters from the names. Calling opendir and then readdir (in a while loop) does not work because readdir returns bytes!

Below is code that I also tried.


# File name = a1.pl
# Script to get rid of wide chars and non-ascii chars
# in a Windows 7 file name.
# This does not work.
# In the Windows 7 file explorer, the file name shows as
# "z &#8206;ay&#8206; &#8206;Pow&#8206;.mp4"
# Note: There is at least one embedded wide char in the above
# pasted file name.
# The Windows 7 command prompt shows the file name
# as
# "z ?ay? ?Pow?.mp4".
use 5.14.2;
# From: how to read unicode filename
# http://www.perlmonks.org/?node_id=536223
open fList, '-|:encoding(UTF-16LE)', 'cmd /U /C dir /W';
# Note: I tried to opendir and readdir. I got the shortened
# 8.3 character file name whenever a wide character
# was in the file name. I could not rename.
foreach (<fList>) {
    utf8::encode($_);
    my $orig_name = $_;
    my $new_name = $_;
    if ($new_name =~ m/.mp4/i) {
        print " 1 orig_name is \"$orig_name\"\n";
        $new_name =~ s![^[:ascii:]]!!ig;
        print " 2 new name is \"$new_name\"\n";
        rename "$orig_name", "$new_name"; # Does not work
    }
}
__END__
In the results below, note that the end double quotes are
not at the end of the file name line! That should not be!
>a1.pl
 1 orig_name is "z ΓÇÄayΓÇÄ ΓÇÄPowΓÇÄ.mp4
"
 2 new name is "z ay Pow.mp4
"
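For reference, here is a minimal sketch of only the name-cleaning step, separate from the rename itself. It assumes the line has already been decoded into Perl characters (so no utf8::encode beforehand) and that the stray closing quote in the output comes from the line ending that readline leaves attached. The helper name ascii_only_name and the sample string are hypothetical, made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes one decoded (character, not byte) line of
# directory output and returns the name with non-ASCII characters removed.
sub ascii_only_name {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;        # drop the line ending that readline keeps
    $line =~ s/[^[:ascii:]]//g;    # remove every non-ASCII character
    return $line;
}

# U+200E (LEFT-TO-RIGHT MARK) is the character written as &#8206; above.
my $listed = "z \x{200E}ay\x{200E} \x{200E}Pow\x{200E}.mp4\r\n";
my $clean  = ascii_only_name($listed);
print "new name is \"$clean\"\n";
```

This only shows how to build the target name with the closing quote on the same line. It does not make rename find the original file, which still has to be addressed by a name that Perl's file functions can pass to Windows unchanged.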

In reply to Rename Windows files with Unicode chars by mnooning
