Re^2: Sorting based on filesize.

You need the Schwartzian Transform

I'd say it's doubtful. Clearly, stat is slow, but it's not a slouch, and the Schwartz Transform adds its own overhead. I suspect the sweet spot where the ST pulls ahead is higher than you think.

If we want to pursue this approach, an improvement will be to employ a Guttman-Rosler Transform:

my @sorted =
    map substr($_, 4) =>
    sort
    map pack(`L',$_) . $_ => @files;
    # but see eyepopslikeamosquito's comment below
[download]

By being able to dip down and call the sort function without a code block will show greater benefits. With the proviso that one might have to pack with 8 byte 'Q' quads, for systems with filesizes larger than 2Gb.

With that in mind, it's probably all premature optimisation. I would wait until the snippet showed itself as being a performance pig before going to all this bother:

    sort { -s $a <=> -s $b } @files
[download]

... does have the distinct advantage of being eminently readable.

- another intruder with the mooring of the heat of the Perl

Comment on Re^2: Sorting based on filesize. Select or Download Code

Replies are listed 'Best First'.
Re^3: Sorting based on filesize. by eyepopslikeamosquito (Archbishop) on Jul 20, 2004 at 08:41 UTC
my @sorted = map substr($_, 4) => sort map pack(`L',$_) . $_ => @files; For improved portability, that pack('L',$_) above should read pack('N',$_) because 'L' is native while 'N' is network (big-endian). See also Re: Advanced Sorting - GRT - Guttman Rosler Transform.	[reply]
•Re^3: Sorting based on filesize. by merlyn (Sage) on Jul 20, 2004 at 13:20 UTC
I'd say it's doubtful. Clearly, stat is slow, but it's not a slouch, and the Schwartz Transform adds its own overhead. I suspect the sweet spot where the ST pulls ahead is higher than you think. Actually, this is the core of one of the alpaca exercises, and I forget the exact numbers, but the ST turns out to be faster after only 10 or 15 files in our investigation. Stat may not be doggy, but it's still a context switch and a lot of bit wrangling. Much better to do that only once per file. And (ahem) it's Schwartzian Transform not Schwartz Transform. If you're going to poo-poo it in favor of the more complex GRT, at least spell it right. {grin} -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply.	[reply]
Re^4: Sorting based on filesize. by pbeckingham (Parson) on Jul 21, 2004 at 00:46 UTC
Talking of ST versus GRT comparisons, Uri's new Sort::Maker can do ST, GRT, Orcish (OR-Cache) and plain old sort, and makes things very easy to benchmark and change between different algorithms. It builds sorting code on the fly. Very cool.	[reply]