in reply to readdir UTF8
in thread Unicode (ä, ö, ü in German) Problem with File::Find under Windows2000

Thank you BrowserUk! Thank you Thelonius!
I think I got more than I had hoped for.

Because of some other trouble, I have only been able to try things out since this morning.
Thelonius, I've tried your code, and I think something goes wrong inside File::Find::finddepth before your fix gets a chance to run.
I used XML to get a UTF-8 string (similar to my old program).
config.xml

<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <srcdir>d:\temp\source\test2</srcdir>
  <dstdir>d:\temp\source\test5</dstdir>
</config>

newcopy6.pl
#!d:\perl\bin\perl.exe -w
use File::Find;
use strict;
use Encode qw(encode_utf8 decode_utf8 is_utf8);
use XML::Simple;

my $configfile=".\\config.xml";
my $config=XMLin($configfile);
my $srcdir="d:\\temp\\source\\test2";
print "\$srcdir: $srcdir\n";
if(is_utf8($srcdir)){
    print "is utf8\n";
}else{
    print "is NOT utf8\n";
    $srcdir=decode_utf8($srcdir); # ???
}
# line "!!!" gets srcdir from the XML config
# or you can comment it out to test
# whether line "???" has any effect or not
$srcdir=$$config{'srcdir'}; # !!!
if(is_utf8($srcdir)){
    print "is utf8\n";
}else{
    print "is NOT utf8\n";
}
{
    local ${^WIDE_SYSTEM_CALLS} = 1;
    finddepth( \&showme, $srcdir );
}

sub fixutf8 {
    for (@_) {
        if (${^WIDE_SYSTEM_CALLS} && !is_utf8($_)) {
            $_ = decode_utf8($_);
        }
    }
}

sub showme {
    print "\$_ = $_\n";
    fixutf8($File::Find::dir,$File::Find::name,$_);
    print "\$_ = $_\n";
}
I got these results in DOS, but the traversal is NOT depth-first!
D:\temp\source>newcopy6.pl
$srcdir: d:\temp\source\test2
is NOT utf8
is utf8
Can't cd to (d:\temp\source\test2/) ├â┬╢a: No such file or directory at D:\temp\source\newcopy6.pl line 28
$_ = ├╢a
$_ = ├╢a
$_ = .
$_ = .
and in Komodo:
$srcdir: d:\temp\source\test2
is NOT utf8
is utf8
$_ = öa
$_ = öa
$_ = .
$_ = .
If I comment out line "!!!", I get the following in Komodo.
Line "???" has NO effect, but the traversal is depth-first:
$srcdir: d:\temp\source\test2
is NOT utf8
is NOT utf8
$_ = ü.txt
$_ = &#52212;xt
$_ = öa
$_ = &#30797;
$_ = .
$_ = .
(I've set UTF-8 as the editor encoding in Komodo's preferences. Some characters can't be posted here correctly; see the note from BrowserUk.)
Then I tested fixutf8 itself, with some extra print statements:
sub fixutf8 {
    for (@_) {
        print "\$_=$_";
        if (${^WIDE_SYSTEM_CALLS} && !is_utf8($_)) {
            $_ = decode_utf8($_);
        }
        if(is_utf8($_)){
            print "#\$_=$_ is utf8\n";
        }else{
            print "#\$_=$_ is NOT utf8\n";
        }
    }
}
and got:
$srcdir: d:\temp\source\test2
is NOT utf8
is NOT utf8
$_ = ü.txt
$_=d:\temp\source\test2/öa#$_=d:\temp\source\test2/&#30816;is utf8
$_=d:\temp\source\test2/öa/ü.txt#$_=d:\temp\source\test2/&#30831;&#52212;xt is utf8
$_=ü.txt#$_=&#52212;xt is utf8
$_ = &#52212;xt
$_ = öa
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=d:\temp\source\test2/öa#$_=d:\temp\source\test2/&#30816;is utf8
$_=öa#$_=&#30816;is utf8
$_ = &#30797;
$_ = .
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=.#$_=. is NOT utf8
$_ = .

So, my guess is:
If I give finddepth a UTF-8 dirname, it gets the names of the child nodes as plain ("ASCII") byte strings but can't handle them correctly, as in the first two results (DOS and Komodo).

If I give finddepth a normal string in the program's own format, it handles them without problems, as in the last result.
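
If that guess is right, one possible workaround is to turn the UTF-8-flagged string from XML::Simple back into a plain byte string before handing it to finddepth. The following is only a minimal sketch: the helper name to_native_bytes is made up for illustration, and 'cp1252' is an assumption about the code page of a German Windows 2000 system; it has not been tested against File::Find on Windows.

```perl
use strict;
use warnings;
use Encode qw(encode decode is_utf8);

# Hypothetical helper: convert a character string (UTF-8 flag on) into a
# byte string in the code page the file system calls are assumed to expect.
# 'cp1252' is an assumption, not something confirmed for this system.
sub to_native_bytes {
    my ($name, $codepage) = @_;
    $codepage = 'cp1252' unless defined $codepage;
    return is_utf8($name) ? encode($codepage, $name) : $name;
}

# Decoding UTF-8 bytes mimics a path that came out of the XML config;
# "\x{C3}\x{B6}" are the two UTF-8 bytes of "o-umlaut".
my $from_xml = decode('UTF-8', "d:\\temp\\source\\test2/\x{C3}\x{B6}a");
my $native   = to_native_bytes($from_xml);

printf "flagged before: %s, after: %s\n",
    (is_utf8($from_xml) ? 'yes' : 'no'),
    (is_utf8($native)   ? 'yes' : 'no');
```

The result of to_native_bytes could then be passed to finddepth instead of the raw value from the config, so that the start directory and the child names it is joined with are both byte strings.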

In the end I am using plain text as the config file...
I am somewhat disappointed.
But there are still things I can't understand:
--Why does line "???" have no effect?
--According to Thelonius' post, there is no function like getEncoding, but then what encoding do strings have inside the program?
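
A partial observation on the first question, offered as a sketch and not a definitive answer: in the script above, line "!!!" overwrites $srcdir after line "???" has run, and the literal path contains only ASCII characters, so decoding it cannot change its content either way. The snippet below only illustrates that point; the variable names are made up, and whether is_utf8 reports the flag as on or off for ASCII-only data may depend on the Perl and Encode versions.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 is_utf8);

my $ascii_path = "d:\\temp\\source\\test2";   # ASCII characters only
my $decoded    = decode_utf8($ascii_path);

# Decoding cannot change an ASCII-only string: same characters, same length.
print "same string: ", ($decoded eq $ascii_path ? "yes" : "no"), "\n";

# is_utf8 only reports Perl's internal flag; it says nothing about content.
print "flag: ", (is_utf8($decoded) ? "on" : "off"), "\n";

# With non-ASCII input the difference becomes visible:
my $bytes = "\x{C3}\x{B6}a";        # the UTF-8 bytes of "o-umlaut" plus "a"
my $chars = decode_utf8($bytes);    # decoded into characters
print length($bytes), " bytes -> ", length($chars), " characters\n";
```

As for the second question, as far as I understand it a Perl string is either a byte string or a character string with the internal UTF-8 flag set, and is_utf8 only tells which of the two it is, not which encoding the bytes were originally in.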

By the way, if you visit www.perl-community.de (where I also posted), you can see some other German-on-Win32 problems. For German in DOS there is a solution from Crian.