A good strategy for you would be one of two things. Either keep splitting your code into smaller and smaller parts until some of them start working, and see which change made the difference; or combine the small, self-contained, reproducible examples we provide back into a whole that resembles your current code, and see when it stops working.

Right now, your encoding handling is doing the wrong thing. Let's start with a small file and get it to output UTF-8 from Perl wide characters:

use warnings;
binmode STDOUT, ":utf8";
print "\x{44b}\n";

No warnings, the string literal is definitely wide, and the output is evidently UTF-8. This was achieved by adding a PerlIO layer to STDOUT that encodes wide characters to UTF-8 bytes. We can verify that by listing the output layers on STDOUT:

use Data::Dumper;
binmode STDOUT, ":utf8";
use PerlIO;
print Dumper [ PerlIO::get_layers \*STDOUT, output => 1 ];
__END__
$VAR1 = [ 'unix', 'perlio', 'utf8' ];

Your code,

use open qw/:std :utf8/;
use open OUT => ':encoding(UTF-8)', ':std';

adds the UTF-8 encoding layer multiple times:

$VAR1 = [ 'unix', 'perlio', 'utf8', 'encoding(utf-8-strict)', 'utf8' ];
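
To see why a stacked encoding layer garbles the output, here is a minimal sketch that encodes the same character twice by hand with Encode. It only imitates what an extra layer would do to the bytes; hex_bytes is a hypothetical helper introduced for this illustration, not something from your code.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $char  = "\x{44b}";                  # CYRILLIC SMALL LETTER YERU
my $once  = encode('UTF-8', $char);     # the correct UTF-8 bytes
my $twice = encode('UTF-8', $once);     # each byte is treated as a character
                                        # and encoded again

print 'encoded once:  ', hex_bytes($once),  "\n";
print 'encoded twice: ', hex_bytes($twice), "\n";
```

The second line of output has twice as many bytes as the first. A terminal that expects UTF-8 would decode those four bytes as two unrelated characters instead of one Cyrillic letter.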

That would be one of the reasons why you are getting mojibake instead of Cyrillic characters. It may be helpful to use simpler and more explicit code for now, until you better understand the machinery that makes it all tick. Start with binmode STDOUT, ":utf8" and get your code to output correctly encoded UTF-8 to STDOUT (after reading your code, I think you are almost there: everywhere you get UTF-8 bytes, you decode them correctly before printing). Once that works, start adding pragmas like open that save you typing.
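
A minimal sketch of that explicit approach might look like the following: one encoding layer on STDOUT, and one explicit decode at the point where bytes enter the program. The byte string is an assumption standing in for a UTF-8 encoded file name as something like readdir might return it; it is not taken from your code.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The only encoding layer on STDOUT: wide characters in, UTF-8 bytes out.
binmode STDOUT, ':utf8';

# Assumption: raw UTF-8 bytes, as a file name might arrive from the OS.
my $bytes = "\xD1\x84\xD0\xB0\xD0\xB9\xD0\xBB.txt";

# Decode exactly once, turning bytes into Perl characters.
my $name = decode('UTF-8', $bytes);

printf "%d bytes in, %d characters after decoding\n",
    length $bytes, length $name;
print "$name\n";
```

Once a small script like this prints Cyrillic correctly on your terminal, the same two steps (decode on input, one layer on output) can be moved into the larger program piece by piece.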

I am not sure why your code would (appear to) entirely skip non-ASCII files and directories, but perhaps we could shed some light on it once we get the Unicode display problem resolved.


In reply to Re^3: help with cyrillic characters in odd places by Anonymous Monk
in thread help with cyrillic characters in odd places by Aldebaran
