Anyway, it turned out to be a matter of using ST

I don't think you need a Schwartzian Transform here.  An ST makes sense if the individual comparison operation is computationally expensive.  That is not the case when interpreting a string as a number, particularly because the conversion is done only once for each string and then "cached" in the NV/IV fields of the scalar variable(*).  In other words, the simple approach (not using ST) is even faster in this case:

    #!/usr/bin/perl

    use strict;
    use warnings;
    no warnings 'numeric';

    use Benchmark 'cmpthese';

    for my $e (2..5) {
        my $n = 10**$e;
        print "\nNumber of file names: $n\n";

        my @data;
        push @data, join(".", int(rand($n)), int(rand($n)), 'force.0.5.1LGY.pdb') for 1..$n;

        cmpthese( 10**(6-$e), {
            'simple' => sub {
                my @unsorted = @data;
                my @sorted = sort { $a <=> $b } @unsorted;
            },
            'ST' => sub {
                my @unsorted = @data;
                my @sorted = map $_->[0],
                             sort { $a->[1] <=> $b->[1] }
                             map { [ $_, int $_ ] } @unsorted;
            },
        } );
    }

    __END__

    Number of file names: 100
              Rate     ST simple
    ST      3247/s     --   -75%
    simple 12987/s   300%     --

    Number of file names: 1000
             Rate     ST simple
    ST      248/s     --   -79%
    simple 1176/s   375%     --

    Number of file names: 10000
             Rate     ST simple
    ST     10.3/s     --   -74%
    simple 39.2/s   280%     --

    Number of file names: 100000
           s/iter     ST simple
    ST       1.87     --   -50%
    simple  0.943    99%     --

Another beneficial side effect of the simple approach is that if you happen to have two names like this

    30.31.force.0.5.1LGY.pdb
    30.32.force.0.5.1LGY.pdb

they would be ordered in some useful way, because the fractional part of the number is automatically taken into consideration when just treating the name as a number.
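To make the point about the fractional part concrete, here is a minimal sketch of the simple approach applied to a few hypothetical file names (the names in @names are made up for illustration). It assumes that only the leading numeric part of each name matters for the ordering:

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'numeric';   # the names are not purely numeric, so silence that warning

# Hypothetical sample names; Perl reads only the leading "NN.NN" part as a number.
my @names = (
    '30.32.force.0.5.1LGY.pdb',
    '4.7.force.0.5.1LGY.pdb',
    '30.31.force.0.5.1LGY.pdb',
);

# Plain numeric sort: "30.31..." compares as 30.31 and "30.32..." as 30.32.
my @sorted = sort { $a <=> $b } @names;

print "$_\n" for @sorted;
```

With an ST keyed on int $_, the two names starting with 30 would have the same key (30), so their relative order would not depend on the second field at all. Note that the second field is read as a decimal fraction here, so a name starting with 30.5 would sort after one starting with 30.10.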


(*)

    use Devel::Peek;

    my $s = "30.31.force.0.5.1LGY.pdb";
    Dump $s;
    print 0+$s, "\n";  # treat as number
    Dump $s;

    __END__
    SV = PV(0x605150) at 0x604fa0
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,POK,pPOK)
      PV = 0x6370d0 "30.31.force.0.5.1LGY.pdb"\0
      CUR = 24
      LEN = 32
    30.31
    SV = PVNV(0x607880) at 0x604fa0
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,NOK,POK,pIOK,pNOK,pPOK)
      IV = 30       <---
      NV = 30.31    <---
      PV = 0x6370d0 "30.31.force.0.5.1LGY.pdb"\0
      CUR = 24
      LEN = 32

In reply to Re^3: sorting an array of file names by almut
in thread sorting an array of file names by hotel
