Most Reverent Monks,
The included script reads a directory containing Latin-1 accented characters and displays a correctly sorted list on both Linux and Windows OS, but a few changes are needed:
- Linux : Uncomment 'use utf8::all', save with the default utf-8 encoding and run.
- Windows : Comment out 'use utf8::all' (line 8), save with the default iso-8859-1 or ANSI encoding, chcp 1252 on the command line and run.
To test accented characters, create a subdirectory named 'test' containing several files whose name start with normal uc and lc ascii characters and Latin-1 (Western Europe) accented characters (example: Drives, eval1, Eval2, éval3, Éval4, files, Übermensch, utilities). This is the sorted directory you'll get with ls (Linux) or dir (Windows), or with any graphical file and directory manager.
use utf8::all; # Comment out for Windows use Unicode::Collate; # No argument: current directory; com. line accepts dir. name. my $dir = ($ARGV[0] ? shift : '.'); opendir(my $dh, $dir) or die "\n\tCannot open directory : $!\n"; my @list = grep {!/^[\.]{1,2}$/} readdir $dh; #^ skips '.' and '..' print "$_\n" for @list; print "\tEnd unsorted\n\n"; my $collator = Unicode::Collate->new(level => 1); my @entries = $collator->sort(@list); print "$_\n" for (@entries); print "\tEnd sorted\n\n";
Looking for a simpler way, I added the following snippet, which doesn't work:
[...] use Config; use utf8::all if $Config{osname} eq 'Linux'; # perl adamantly ignores +the condition [...]
Further, perl cannot chcp on a Windows terminal.
My question : Is it possible to write a 'universal script' that would automatically detect the OS and act accordingly?
-0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0Thank you so much, Monks!
With $^O, I get 'MSWin32' on my Windows 8 (64 bits) machine. So, just add the two following lines to my script:
use if $^O ne 'MSWin32', 'utf8::all'; system('chcp 1252') if $^O eq 'MSWin32';
Kludgy, but it does the job on both Linux and Windows, and possibly on Unix and Mac, too. If the user still gets funny characters, he has to manually save his file with the correct encoding, iso-8859-1 or ANSI for Windows or UTF-8 for most other OSes (untested). This is apparently the only thing that Perl cannot do for the unwary user!
'Confundant omnes , ultimus alienat'
In reply to Cross-platform accented character file names sorting by perlimpinpin
For: | Use: | ||
& | & | ||
< | < | ||
> | > | ||
[ | [ | ||
] | ] |