in reply to readdir UTF8
in thread Unicode (ä, ö, ü in German) Problem with File::Find under Windows2000
Thank you BrowserUk! Thank you Thelonius!
I think I got more than I hoped for.
Because of some other trouble, I could only start trying things out this morning.
Thelonius, I've tried your code, and I think something goes wrong inside File::Find::finddepth before your fix gets a chance to run.
I used XML to get a UTF-8 string (similar to my old program).
config.xml
```xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <srcdir>d:\temp\source\test2</srcdir>
  <dstdir>d:\temp\source\test5</dstdir>
</config>
```
And I got results in DOS, but it's NOT depth first!

```perl
#!d:\perl\bin\perl.exe -w
use File::Find;
use strict;
use Encode qw(encode_utf8 decode_utf8 is_utf8);
use XML::Simple;

my $configfile=".\\config.xml";
my $config=XMLin($configfile);
my $srcdir="d:\\temp\\source\\test2";
print "\$srcdir: $srcdir\n";
if(is_utf8($srcdir)){
    print "is utf8\n";
}else{
    print "is NOT utf8\n";
    $srcdir=decode_utf8($srcdir); # ???
}
# line "!!!" gets srcdir from the XML config
# or you can comment it out to test
# whether line "???" takes any effect or not
$srcdir=$$config{'srcdir'}; # !!!
if(is_utf8($srcdir)){
    print "is utf8\n";
}else{
    print "is NOT utf8\n";
}
{
    local ${^WIDE_SYSTEM_CALLS} = 1;
    finddepth( \&showme, $srcdir );
}

# decode any argument that is not yet flagged as UTF-8 (modifies it in place)
sub fixutf8 {
    for (@_) {
        if (${^WIDE_SYSTEM_CALLS} && !is_utf8($_)) {
            $_ = decode_utf8($_);
        }
    }
}

# print each name before and after fixutf8
sub showme {
    print "\$_ = $_\n";
    fixutf8($File::Find::dir,$File::Find::name,$_);
    print "\$_ = $_\n";
}
```
and in Komodo:

```text
D:\temp\source>newcopy6.pl
$srcdir: d:\temp\source\test2
is NOT utf8
is utf8
Can't cd to (d:\temp\source\test2/) öa: No such file or directory at D:\temp\source\newcopy6.pl line 28
$_ = öa
$_ = öa
$_ = .
$_ = .
```
And if I comment out the line marked "!!!", I get this in Komodo:

```text
$srcdir: d:\temp\source\test2
is NOT utf8
is utf8
$_ = öa
$_ = öa
$_ = .
$_ = .
```
(I've set UTF-8 as the editor encoding in Komodo's preferences; some characters can't be posted here correctly, see the note from BrowserUk.)

```text
$srcdir: d:\temp\source\test2
is NOT utf8
is NOT utf8
$_ = ü.txt
$_ = 쯴xt
$_ = öa
$_ = 硍
$_ = .
$_ = .
```
Then, with fixutf8 changed to this, I get:

```perl
sub fixutf8 {
    for (@_) {
        print "\$_=$_";
        if (${^WIDE_SYSTEM_CALLS} && !is_utf8($_)) {
            $_ = decode_utf8($_);
        }
        if(is_utf8($_)){
            print "#\$_=$_ is utf8\n";
        }else{
            print "#\$_=$_ is NOT utf8\n";
        }
    }
}
```
```text
$srcdir: d:\temp\source\test2
is NOT utf8
is NOT utf8
$_ = ü.txt
$_=d:\temp\source\test2/öa#$_=d:\temp\source\test2/硠is utf8
$_=d:\temp\source\test2/öa/ü.txt#$_=d:\temp\source\test2/硯쯴xt is utf8
$_=ü.txt#$_=쯴xt is utf8
$_ = 쯴xt
$_ = öa
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=d:\temp\source\test2/öa#$_=d:\temp\source\test2/硠is utf8
$_=öa#$_=硠is utf8
$_ = 硍
$_ = .
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=.#$_=. is NOT utf8
$_ = .
```
So, my guess is:
If I give finddepth a UTF-8 directory name, it gets "ASCII" (non-UTF-8) names for the child nodes but can't handle them correctly, as in the first two results (DOS and Komodo).
If I give finddepth a normal string written directly in the program source, it has no problem handling them, as in the last result.
In the end I used a plain text file as the config file...
Somewhat disappointed.
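If that guess is right, one thing that might be worth trying is to turn the flagged string from XML::Simple back into plain octets before handing it to finddepth. This is only a minimal sketch: `to_fs_octets` is a hypothetical helper of my own, `cp1252` is an assumption about the code page (it is not detected), and the demo tree uses ASCII names only, so it shows the depth-first order I expect rather than proving anything about umlauts.

```perl
use strict;
use warnings;
use Encode qw(encode is_utf8);
use File::Find qw(finddepth);
use File::Path qw(mkpath);
use File::Temp qw(tempdir);

# Hypothetical helper: turn a character string (e.g. from XML::Simple)
# into octets in the given code page; unflagged strings pass through.
sub to_fs_octets {
    my ($path, $codepage) = @_;
    $codepage ||= 'cp1252';    # assumption, not detected
    # Characters that do not exist in the code page get a substitute character.
    return is_utf8($path) ? encode($codepage, $path) : $path;
}

# Build a small ASCII-only tree: <root>/sub/file.txt
my $root = tempdir(CLEANUP => 1);
mkpath("$root/sub");
open my $fh, '>', "$root/sub/file.txt" or die "cannot create file: $!";
close $fh;

# finddepth visits a directory's contents before the directory itself.
my @seen;
finddepth(sub { push @seen, $_ }, to_fs_octets($root));
print "$_\n" for @seen;
```

In the real program the call would become something like `finddepth(\&showme, to_fs_octets($srcdir))`; whether that plays well with `${^WIDE_SYSTEM_CALLS}` is something I have not verified.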
But there are still things I can't understand:
-- Why does the line marked "???" have no effect?
-- According to Thelonius' post, there is no function like getEncoding, but then what encoding do the strings in the program have?
By the way, if you visit www.perl-community.de (where I also posted), you can see some other German-on-Win32 problems; for German in DOS there is a solution from Crian.
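To make the second question concrete: as far as I understand the Encode documentation, `is_utf8` only reports Perl's internal flag on a scalar and says nothing about which encoding a string of octets is in. The closest thing to a "getEncoding" I can think of is a guess: try a strict UTF-8 decode, and if that fails fall back to a code page. A minimal sketch, where `to_chars` is a hypothetical helper and `cp1252` is my assumption for the fallback (the DOS console may well use a different code page):

```perl
use strict;
use warnings;
use Encode qw(decode is_utf8);

# Hypothetical helper: return a character string for a name that may be
# UTF-8 octets, code-page octets, or an already decoded string.
sub to_chars {
    my ($name, $fallback) = @_;
    $fallback ||= 'cp1252';           # assumption, not detected
    return $name if is_utf8($name);   # already flagged: leave it alone

    # Strict UTF-8 decode; FB_CROAK dies on malformed input and
    # LEAVE_SRC keeps $name untouched.
    my $chars = eval {
        decode('UTF-8', $name, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    return $chars if defined $chars;

    # Not valid UTF-8, so treat the octets as the fallback code page.
    return decode($fallback, $name);
}

my $as_utf8   = "\xC3\xB6a";   # "öa" as UTF-8 octets
my $as_cp1252 = "\xF6a";       # "öa" as cp1252 octets
print to_chars($as_utf8) eq to_chars($as_cp1252)
    ? "same name\n"
    : "different names\n";
```

This is a guess, not a detection: some cp1252 byte sequences are also valid UTF-8, and those would be decoded the wrong way.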