cr8josh has asked for the wisdom of the Perl Monks concerning the following question:

Greetings. I've been looking at previous threads but can't find anything about my particular problem.

I need to sort an array of filenames, to match how Windows would sort them in Explorer.

Using a "natural" sort gets me extremely close, with one discrepancy. I've tried both Sort::Naturally and Sort::Key::Natural.

Those modules sort these two names as follows:

HC_TalkRoar_1b

HC_TalkRoar_1

Windows sorts them as follows:

HC_TalkRoar_1

HC_TalkRoar_1b

I can't quite believe I'm having such a hard time finding something to reproduce Windows file-sorting order, and would greatly appreciate any assistance!

Thanks!

Re: Natural Sort / Windows sort problem
by ikegami (Patriarch) on Oct 30, 2017 at 17:31 UTC

    You are mistaken. While neither module guarantees the order used by Windows Explorer, they do not produce the order you claim they do.

    $ perl -e'
        use feature qw( say );
        use Sort::Naturally qw( nsort );
        say for nsort qw( HC_TalkRoar_1 HC_TalkRoar_1b );
    '
    HC_TalkRoar_1
    HC_TalkRoar_1b

    $ perl -e'
        use feature qw( say );
        use Sort::Key::Natural qw( natsort );
        say for natsort qw( HC_TalkRoar_1 HC_TalkRoar_1b );
    '
    HC_TalkRoar_1
    HC_TalkRoar_1b

      Thank you, you helped me find the issue which was not the sorting. Much appreciated! The issue was because of my file extensions. If you add .wav to the end of those two strings, they sort backwards. Filtering out the file extensions solves it.
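A minimal sketch of the "filter out the extension" fix described above, using only core modules. The helpers `natural_cmp` and `sort_ignoring_extension` are hypothetical names written for this illustration; the comparison is a simple stand-in, not the algorithm used by Sort::Naturally or Sort::Key::Natural, and it is not claimed to match Windows Explorer in every case.

```perl
use strict;
use warnings;
use File::Basename qw( fileparse );

# Hypothetical helper: a simple natural-order comparison written for this
# sketch. Digit runs compare numerically, everything else case-insensitively.
sub natural_cmp {
    my ($x, $y) = @_;
    my @x = $x =~ /(\d+|\D+)/g;
    my @y = $y =~ /(\d+|\D+)/g;
    while (@x && @y) {
        my ($p, $q) = (shift @x, shift @y);
        my $r = ($p =~ /^\d/ && $q =~ /^\d/)
              ? ($p <=> $q || $p cmp $q)
              : (lc $p cmp lc $q);
        return $r if $r;
    }
    # All shared chunks are equal: the name with fewer chunks sorts first.
    return @x <=> @y;
}

# Hypothetical helper: compare the names without their extensions first,
# and fall back to the extension only to break ties.
sub sort_ignoring_extension {
    my @split = map { [ $_, (fileparse($_, qr/\.[^.]*/))[0, 2] ] } @_;
    return map  { $_->[0] }
           sort { natural_cmp($a->[1], $b->[1]) || natural_cmp($a->[2], $b->[2]) }
           @split;
}

print "$_\n" for sort_ignoring_extension(qw( HC_TalkRoar_1b.wav HC_TalkRoar_1.wav ));
```

With the extension split off, `HC_TalkRoar_1` is a prefix of `HC_TalkRoar_1b`, so `HC_TalkRoar_1.wav` is printed first. If one of the CPAN modules is available, its comparison could be used in place of `natural_cmp`.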
Re: Natural Sort / Windows sort problem
by huck (Prior) on Oct 30, 2017 at 17:59 UTC

    use strict;
    use warnings;
    use File::Find;

    my @finds = ('D:/goodies/pdhuck/down1/perl/monks/winsorts');
    my @fns = ();
    my $sub_want_info2 = sub {
        push @fns, $File::Find::name if -f $File::Find::name;
    };
    find({ wanted => $sub_want_info2, no_chdir => 1 }, @finds);

    use Encode qw/encode decode from_to/;
    ##http://code.activestate.com/lists/activeperl/21182/
    ##https://gist.github.com/whitebell/5692575
    my %fnt;
    for my $fn (@fns) { $fnt{ encode('UTF-16LE', decode('UTF-8', $fn)) } = $fn; }

    use Win32::API;
    my $f = Win32::API->new("shlwapi.dll", "StrCmpLogicalW", "PP", "I");
    my @fnts = sort { $f->Call($a, $b) } keys %fnt;
    @fns = ();
    for my $fntt (@fnts) { push @fns, $fnt{$fntt}; }
    for my $fn (@fns) { print $fn . "\n"; }

    result
    D:/goodies/pdhuck/down1/perl/monks/winsorts/a1
    D:/goodies/pdhuck/down1/perl/monks/winsorts/u2
    D:/goodies/pdhuck/down1/perl/monks/winsorts/u100
    D:/goodies/pdhuck/down1/perl/monks/winsorts/u200

      Thanks! That solved other issues I was having! Now I just have to figure out what it's doing!!
Re: Natural Sort / Windows sort problem
by salva (Canon) on Oct 31, 2017 at 07:53 UTC
    IIRC, Sort::Naturally implemented some tricks to be more like a file explorer, while Sort::Key::Natural is more pure.

    If you want the file extension to be treated as an independent secondary sorting key, you can use Sort::Key::Maker for creating a multi-key sorting function:

    use Sort::Key::Natural;
    use Sort::Key::Maker sort_filenames =>
        sub { /^(.*?)((?:\.[^\.]*)?)$/ },
        qw(natural natural);

    my @filenames = qw(...);
    my @sorted = sort_filenames @filenames;

      Thank you!

      I found something I don't understand, which is forcing me to use the Win32 call method described above. When trying to do a case-insensitive sort by using {uc $_} or {lc $_}, I get different and unexpected results. Here is an example using the same string sets, one upper case, one not.

      use feature qw( say );
      use Sort::Key::Natural qw( natsort );

      say for natsort qw(
          P007B_YUMYUMNOISESANDTASTING1
          P007_YUMYUMNOISESANDTASTING5A
          P007_YUMYUMNOISESANDTASTING5B
          P007_YUMYUMNOISESANDTASTING5C
      );
      say for natsort qw(
          P007b_YumYumNoisesAndTasting1
          P007_YumYumNoisesAndTasting5A
          P007_YumYumNoisesAndTasting5B
          P007_YumYumNoisesAndTasting5C
      );

      Output:

      P007B_YUMYUMNOISESANDTASTING1
      P007_YUMYUMNOISESANDTASTING5A
      P007_YUMYUMNOISESANDTASTING5B
      P007_YUMYUMNOISESANDTASTING5C
      P007_YumYumNoisesAndTasting5A
      P007_YumYumNoisesAndTasting5B
      P007_YumYumNoisesAndTasting5C
      P007b_YumYumNoisesAndTasting1

      In the first case, the "P007B_" sorts before "P007_". In the second case it sorts after. If I do all lc, it's the same thing, 7b sorts to the beginning instead of the end:

      P007b_yumyumnoisesandtasting1
      P007_yumyumnoisesandtasting5a
      P007_yumyumnoisesandtasting5b
      P007_yumyumnoisesandtasting5c

      Anyone understand why? I need it to sort after, but be case insensitive...

      Thanks!
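
One possible explanation for the behaviour described above, sketched under an assumption: that the natural-sort key is built from runs of digits and runs of letters, with every other character (including `_`) acting only as a separator, and that letter runs are compared case-sensitively. The `tokens` helper is a hypothetical model written for this illustration, not code taken from Sort::Key::Natural.

```perl
use strict;
use warnings;

# Hypothetical model of a natural-sort key: runs of digits and runs of
# letters; any other character (including "_") only separates tokens.
# This is an assumption for illustration, not the module's actual code.
sub tokens {
    my ($name) = @_;
    return $name =~ /(\d+|[[:alpha:]]+)/g;
}

for my $pair (
    [ 'P007B_YUMYUMNOISESANDTASTING1', 'P007_YUMYUMNOISESANDTASTING5A' ],
    [ 'P007b_YumYumNoisesAndTasting1', 'P007_YumYumNoisesAndTasting5A' ],
) {
    my @left  = tokens($pair->[0]);
    my @right = tokens($pair->[1]);

    # The first two tokens ("P" and "007") are identical in both names,
    # so under this model the third token decides the order.
    my $result = $left[2] cmp $right[2];
    print "'$left[2]' cmp '$right[2]' => $result\n";
}
```

Under this model the deciding comparison is between the lone letter after `007` and the word that follows the underscore. `B` sorts before `Y`, and `b` sorts before `y`, but lowercase `b` sorts after uppercase `Y` in a plain string comparison, which would account for `P007b_` moving to the end only in the mixed-case list.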