I think what's going on here is that file creation fails to recognize that you are using wide characters. I could be wrong, since the characters you list (code points 0xd6, 0xe5 and 0xc5) should be supported by a normal open. Unfortunately, based on some research I've done (with some guidance from tye in the CB), this does not seem to be trivial to deal with. You need to tell Windows explicitly that you are creating files with wide characters in their names. To do this, use the CreateFileW function from Win32API::File, followed by OsFHandleOpen to turn the native handle it returns into a Perl file handle. At that point, I believe you should be good to go. There are a good number of PerlMonks nodes on this, so try searching for site:www.perlmonks.org createfilew. Of the resulting links, I found Re^3: Saving file name with Chinese characters useful.
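For what it's worth, here is a rough, untested sketch of the CreateFileW plus OsFHandleOpen approach described above. The helper names wide_name and open_wide_for_write and the example file name are my own inventions for illustration, and I am assuming CREATE_ALWAYS and write-only access are what you want; adjust the flags to match your real open mode.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Symbol qw(gensym);
# Only load the Windows-specific module when actually running on Windows.
use if $^O eq 'MSWin32', 'Win32API::File' => qw(:ALL);

# Hypothetical helper: the *W functions want a NUL-terminated UTF-16LE string.
sub wide_name {
    my ($name) = @_;
    return encode('UTF-16LE', "$name\0");
}

# Hypothetical helper: create (or truncate) a file by its Unicode name and
# return an ordinary Perl file handle opened for writing.
sub open_wide_for_write {
    my ($name) = @_;
    my $native = CreateFileW(
        wide_name($name),
        GENERIC_WRITE(),    # write access only (an assumption)
        0,                  # no sharing
        [],                 # default security attributes
        CREATE_ALWAYS(),    # create, or truncate if it exists (an assumption)
        0,                  # no special flags
        [],                 # no template handle
    ) or die "CreateFileW failed: $^E";

    my $fh = gensym();      # anonymous typeglob to attach the handle to
    OsFHandleOpen($fh, $native, 'w')
        or die "OsFHandleOpen failed: $^E";
    return $fh;
}

if ($^O eq 'MSWin32') {
    # The three code points from the question, used as a made-up file name.
    my $fh = open_wide_for_write("\x{d6}\x{e5}\x{c5}.txt");
    print {$fh} "hello\n";
    close $fh or die "close failed: $!";
}
```

Once OsFHandleOpen succeeds, the Perl handle owns the native handle, so a plain close is all the cleanup needed.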
As a side note, your open statements aren't doing what you think: when you say
open MISSING, "+>>$targetDir/$k/$pr/$year/missing.txt" || die "could not open $targetDir/$k/$pr/$year/missing.txt";
the || binds tighter than the list operator (comma), so you will never test if the open succeeded. It's equivalent to
open MISSING, ("+>>$targetDir/$k/$pr/$year/missing.txt" || die "could not open $targetDir/$k/$pr/$year/missing.txt");
which is always equivalent to
open MISSING, "+>>$targetDir/$k/$pr/$year/missing.txt";
since "+>>$targetDir/$k/$pr/$year/missing.txt" always evaluates to true. You need to either add parens around the arguments to open or use the low-precedence or. See Burned by precedence rules and Operator Precedence and Associativity.
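If you want to see the difference for yourself, here is a small sketch that tries all three spellings against a path that should not exist. The dies helper and the path are made up for the demonstration, and I have used a lexical file handle rather than your MISSING bareword; none of that comes from your script.

```perl
use strict;
use warnings;

# A path inside a directory that should not exist, so every open below fails.
my $missing = "no-such-dir-$$/missing.txt";

# Hypothetical helper: returns 1 if the code block died, 0 otherwise.
sub dies {
    my ($code) = @_;
    return eval { $code->(); 1 } ? 0 : 1;
}

# Original form: || applies to the (always true) string, so die never runs.
my $buggy = dies(sub { open my $fh, "+>>$missing" || die "could not open $missing" });

# Fix 1: parens around open's arguments, so || tests open's return value.
my $parens = dies(sub { open(my $fh, "+>>$missing") || die "could not open $missing: $!" });

# Fix 2: low-precedence or, which binds more loosely than the comma.
my $low_or = dies(sub { open my $fh, "+>>$missing" or die "could not open $missing: $!" });

print "buggy: $buggy, parens: $parens, or: $low_or\n";
# prints "buggy: 0, parens: 1, or: 1"
```

Adding $! to the die message also tells you why the open failed, which the original message did not.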