Most Reverent Monks,

The included script reads a directory whose file names contain Latin-1 accented characters and displays a correctly sorted list on both Linux and Windows, but each OS needs a small change first:

- Linux: Uncomment 'use utf8::all', save the file with the default UTF-8 encoding, and run.

- Windows: Comment out 'use utf8::all' (line 8), save the file with the default ISO-8859-1 or ANSI encoding, run chcp 1252 on the command line, then run the script.

To test accented characters, create a subdirectory named 'test' containing several files whose names start with ordinary uppercase and lowercase ASCII characters and with Latin-1 (Western European) accented characters (example: Drives, eval1, Eval2, éval3, Éval4, files, Übermensch, utilities). This is the sorted order you'll get with ls (Linux) or dir (Windows), or with any graphical file and directory manager.

    use utf8::all; # Comment out for Windows
    use Unicode::Collate;

    # No argument: current directory; com. line accepts dir. name.
    my $dir = ($ARGV[0] ? shift : '.');
    opendir(my $dh, $dir) or die "\n\tCannot open directory : $!\n";
    my @list = grep {!/^[\.]{1,2}$/} readdir $dh; # skips '.' and '..'
    print "$_\n" for @list;
    print "\tEnd unsorted\n\n";

    my $collator = Unicode::Collate->new(level => 1);
    my @entries = $collator->sort(@list);
    print "$_\n" for (@entries);
    print "\tEnd sorted\n\n";
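To check the collation step on its own, without touching the file system, here is a minimal sketch that sorts the example names from a hard-coded list. The helper name sort_names is mine, not part of the original script, and the snippet assumes it is saved as UTF-8 (hence the 'use utf8').

```perl
use strict;
use warnings;
use utf8;                # this snippet's own source contains accented literals
use Unicode::Collate;    # core module

# Hypothetical helper: sort at collation level 1, which compares base
# letters only, so case and accents do not affect the order.
sub sort_names {
    my @names = @_;
    return Unicode::Collate->new(level => 1)->sort(@names);
}

my @sample = qw(utilities Éval4 files eval1 Übermensch Drives éval3 Eval2);

binmode STDOUT, ':encoding(UTF-8)';    # assumes a UTF-8 terminal
print "$_\n" for sort_names(@sample);
```

On a UTF-8 terminal this should print the names in the same order as the example list above, since at level 1 'é' and 'É' both compare as 'e' and 'Ü' as 'u'.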

Looking for a simpler way, I added the following snippet, which doesn't work:

    [...]
    use Config;
    use utf8::all if $Config{osname} eq 'Linux'; # perl adamantly ignores the condition
    [...]

Further, perl cannot chcp on a Windows terminal.

My question: is it possible to write a 'universal script' that automatically detects the OS and acts accordingly?

-0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 0

Thank you so much, Monks!

With $^O, I get 'MSWin32' on my 64-bit Windows 8 machine. So I just added the following two lines to my script:

    use if $^O ne 'MSWin32', 'utf8::all';
    system('chcp 1252') if $^O eq 'MSWin32';

Kludgy, but it does the job on both Linux and Windows, and possibly on Unix and Mac, too. If the user still gets funny characters, the file has to be re-saved manually with the correct encoding: ISO-8859-1 or ANSI for Windows, UTF-8 for most other OSes (untested). This is apparently the only thing that Perl cannot do for the unwary user!
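For what it's worth, here is a sketch of a different approach that avoids depending on how the script file itself is saved: pick an encoding from $^O and decode the names returned by readdir explicitly. This is not what my script above does. The names $fs_enc and sorted_entries are made up for illustration, and the sketch assumes that file names arrive as cp1252 bytes on Windows and as UTF-8 bytes elsewhere, which I have not verified on every system.

```perl
use strict;
use warnings;
use Encode qw(decode);    # core module
use Unicode::Collate;     # core module

# Assumption: readdir yields cp1252 bytes on Windows, UTF-8 bytes elsewhere.
my $fs_enc = $^O eq 'MSWin32' ? 'cp1252' : 'UTF-8';

# Hypothetical helper: read a directory, decode the names to characters,
# and return them sorted at collation level 1 (case and accents ignored).
sub sorted_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!\n";
    my @names = map  { decode($fs_enc, $_) }
                grep { !/^\.{1,2}\z/ }        # skip '.' and '..'
                readdir $dh;
    closedir $dh;
    return Unicode::Collate->new(level => 1)->sort(@names);
}

# Encode the output the same way; on Windows the console code page
# still has to match (chcp 1252) for the accents to display correctly.
binmode STDOUT, ":encoding($fs_enc)";
print "$_\n" for sorted_entries(@ARGV ? $ARGV[0] : '.');
```

Because the source contains no accented literals, it can be saved in either encoding, and 'use utf8::all' is not needed for this part.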

'Confundant omnes , ultimus alienat'


In reply to Cross-platform accented character file names sorting by perlimpinpin
