Thank you BrowserUk! Thank you Thelonius!
I think I got more than I hoped for.
Because of some other trouble, I have only been able to try a little more since this morning.
Thelonius, I've tried your code, and I think something goes wrong inside File::Find::finddepth before your fix can take effect.
I used XML to get a UTF-8 string (similar to my old program).
config.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
<srcdir>d:\temp\source\test2</srcdir>
<dstdir>d:\temp\source\test5</dstdir>
</config>
newcopy6.pl
#!d:\perl\bin\perl.exe -w
use File::Find;
use strict;
use Encode qw(encode_utf8 decode_utf8 is_utf8);
use XML::Simple;
my $configfile=".\\config.xml";
my $config=XMLin($configfile);
my $srcdir="d:\\temp\\source\\test2";
print "\$srcdir: $srcdir\n";
if(is_utf8($srcdir)){
    print "is utf8\n";
}else{
    print "is NOT utf8\n";
    $srcdir=decode_utf8($srcdir); # ???
}
# Line "!!!" gets srcdir from the XML config.
# Comment it out to test whether
# line "???" has any effect or not.
$srcdir=$$config{'srcdir'}; # !!!
if(is_utf8($srcdir)){
    print "is utf8\n";
}else{
    print "is NOT utf8\n";
}
{
    local ${^WIDE_SYSTEM_CALLS} = 1;
    finddepth( \&showme, $srcdir );
}
sub fixutf8 {
    for (@_) {
        if (${^WIDE_SYSTEM_CALLS} && !is_utf8($_)) {
            $_ = decode_utf8($_);
        }
    }
}
sub showme {
    print "\$_ = $_\n";
    fixutf8($File::Find::dir,$File::Find::name,$_);
    print "\$_ = $_\n";
}
And I got these results in DOS, but it's NOT depth first!
D:\temp\source>newcopy6.pl
$srcdir: d:\temp\source\test2
is NOT utf8
is utf8
Can't cd to (d:\temp\source\test2/) öa: No such file or directory
at D:\temp\source\newcopy6.pl line 28
$_ = ├╢a
$_ = ├╢a
$_ = .
$_ = .
And in Komodo:
$srcdir: d:\temp\source\test2
is NOT utf8
is utf8
$_ = öa
$_ = öa
$_ = .
$_ = .
And if I comment out line "!!!", I get the following in Komodo.
Line "???" has NO effect, but the traversal is depth first:
$srcdir: d:\temp\source\test2
is NOT utf8
is NOT utf8
$_ = ü.txt
$_ = 쯴xt
$_ = öa
$_ = 硍
$_ = .
$_ = .
(I've set UTF-8 as the editor encoding in Komodo's preferences. Some characters can't be posted here correctly; see the note from BrowserUk.)
Then I tested fixutf8 with some extra output:
sub fixutf8 {
    for (@_) {
        print "\$_=$_";
        if (${^WIDE_SYSTEM_CALLS} && !is_utf8($_)) {
            $_ = decode_utf8($_);
        }
        if(is_utf8($_)){
            print "#\$_=$_ is utf8\n";
        }else{
            print "#\$_=$_ is NOT utf8\n";
        }
    }
}
and then I got:
$srcdir: d:\temp\source\test2
is NOT utf8
is NOT utf8
$_ = ü.txt
$_=d:\temp\source\test2/öa#$_=d:\temp\source\test2/硠is utf8
$_=d:\temp\source\test2/öa/ü.txt#$_=d:\temp\source\test2/硯Ȋ
+12;xt is utf8
$_=ü.txt#$_=쯴xt is utf8
$_ = 쯴xt
$_ = öa
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=d:\temp\source\test2/öa#$_=d:\temp\source\test2/硠is utf8
$_=öa#$_=硠is utf8
$_ = 硍
$_ = .
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=d:\temp\source\test2#$_=d:\temp\source\test2 is NOT utf8
$_=.#$_=. is NOT utf8
$_ = .
So my guess is:
If I give finddepth a UTF-8 directory name, it gets the ASCII names of the child nodes but can't handle them correctly, as in the first two results (DOS and Komodo).
If I give finddepth a normal string in the program's own encoding, it handles them without problems, as in the last result.
In the end I used plain text as the config file...
I'm somewhat disappointed.
But there are two things I still can't understand:
-- Why does line "???" have no effect?
-- According to Thelonius' post, there is no function like getEncoding, but then what is the encoding inside the program?
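A minimal sketch for narrowing down the first question, assuming only core Perl plus Encode. The helper flag_state is a hypothetical name, not part of the code above. It prints the state of the internal UTF8 flag after decode_utf8 on a pure-ASCII path and after utf8::upgrade. As far as I understand the Encode documentation, ASCII-only input is described as a special case for the flag, and the actual behaviour may differ between Encode versions, so no expected output is given here.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical helper: report the internal UTF8 flag of a string.
sub flag_state {
    my ($s) = @_;
    return utf8::is_utf8($s) ? "utf8" : "NOT utf8";
}

my $path = "d:\\temp\\source\\test2";    # pure ASCII bytes

# What decode_utf8 does to the flag for ASCII-only input
# may depend on the installed Encode version.
my $decoded = decode_utf8($path);
print "after decode_utf8:   ", flag_state($decoded), "\n";

# utf8::upgrade switches to the flagged internal representation
# without changing the characters of the string.
my $upgraded = $path;
utf8::upgrade($upgraded);
print "after utf8::upgrade: ", flag_state($upgraded), "\n";
print "same characters:     ", ($upgraded eq $path ? "yes" : "no"), "\n";
```

If the first line prints "NOT utf8" while the second prints "utf8", that would suggest line "???" leaves a pure-ASCII path unflagged on that installation, which would match the "is NOT utf8" output above.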
BTW, if you visit www.perl-community.de (where I also posted), you can see some other German-in-Win32 problems. For German in DOS there is a solution from Crian.