Perl 5.8 can use UTF-8 internally, but it can't always tell whether a string of octets is UTF-8, Latin-1, Big5, ..., or just some random binary data.
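
Since Perl can't know the encoding, the program has to guess or be told. Here is a minimal sketch of one common guess: try a strict UTF-8 decode first and fall back to Latin-1 if that fails. The helper name guess_decode is my own invention, not something from Encode, and the fallback is only an assumption about your data.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: try strict UTF-8 first, then fall back to Latin-1.
# Returns the decoded character string and the encoding that was assumed.
sub guess_decode {
    my ($octets) = @_;

    # FB_CROAK makes decode() die on malformed UTF-8; LEAVE_SRC keeps
    # $octets intact so it can be reused for the fallback.
    my $text = eval { decode('UTF-8', $octets, FB_CROAK | LEAVE_SRC) };
    return ($text, 'UTF-8') if defined $text;

    # Every octet sequence is valid Latin-1, so this branch cannot fail,
    # but it may still be the wrong guess (Big5, binary data, ...).
    return (decode('iso-8859-1', $octets), 'iso-8859-1');
}

my ($text,  $enc)  = guess_decode("G\xC3\xB6del");   # UTF-8 octets
my ($text2, $enc2) = guess_decode("G\xF6del");       # Latin-1 octets
```

Both calls should end up with the same five-character string; only the reported encoding differs. Short Latin-1 strings can happen to be valid UTF-8 as well, so treat this as a heuristic, not a detector.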

It looks like readdir is returning a latin1 encoding of the name. This leaves the interesting question of what it would do with some name that isn't representable in latin1.
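
If readdir really is handing back Latin-1 bytes, the names can be turned into Perl character strings by decoding them explicitly. This is only a sketch under that assumption: the $fs_encoding variable and the decoded_names helper are hypothetical, and the right encoding on your machine may well be something else.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: the bytes returned by readdir are in this encoding.
my $fs_encoding = 'iso-8859-1';

# Hypothetical helper: return the entries of $dir (minus . and ..)
# as decoded character strings.
sub decoded_names {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = map  { decode($fs_encoding, $_) }
                grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @names;
}
```

This says nothing about names that can't be represented in that encoding; whatever readdir does with those happens before the decode step.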

As BrowserUK pointed out, the error means that the string in your source file is not encoded as UTF-8. If your editor can handle UTF-8, you can still use non-ASCII strings in your program. For example, in vim, ":set encoding=utf8". That will work, but may not convert pre-existing non-ASCII characters. To re-save an existing file as UTF-8, ":set fileencoding=utf-8" followed by ":w" is the usual route.

I tried a test with File::Find, and finddepth seemed to work okay.
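
For reference, a test along these lines can be as small as the sketch below. It is not the exact script I ran; list_depth_first is a made-up helper name, and it just collects $File::Find::name for everything under a starting directory.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect every path under $top using finddepth,
# which visits a directory's contents before the directory itself.
sub list_depth_first {
    my ($top) = @_;
    my @paths;
    finddepth(sub { push @paths, $File::Find::name }, $top);
    return @paths;
}
```

Calling list_depth_first('.') and printing the result is enough to see whether the names with umlauts come through.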

If you want to display non-ASCII data in a DOS box, you need to convert it to the correct code page. Here's an example program:

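    #!perl -w
    use strict;
    use Encode;
    use utf8;    # the source file itself is saved as UTF-8

    my $test = "This is a test. Gödel";

    # Get the active code page from the DOS CHCP command.
    my $cp = `chcp`;
    if ($cp =~ /(\d+)/) {
        $cp = "cp$1";
    }
    else {
        $cp = "cp437";    # fall back to the original IBM PC code page
    }

    # Encode everything printed to STDOUT in the console's code page.
    binmode STDOUT, ":encoding($cp)" or die "Error on binmode: $!";
    print STDOUT "$test\n";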

In reply to Re: Unicode (ä, ö, ü in German) Problem with File::Find under Windows2000 by Thelonius
in thread Unicode (ä, ö, ü in German) Problem with File::Find under Windows2000 by TeddyC
