Re^5: top ten things every Perl hacker should know

Try to come up with an example. With real code. You'll find that the straightforward sort really is more straightforward.

You'll find time after time again that the straightforward sort block is simpler to write in Perl than the fancy sorts. OK, thinking carefully about it, there is one exception. And that exception is where the code to extract "what you want to sort by" is very complex, so that it is more complex to do it both for $a and $b than it is to do it once and have a Schwartzian Transform. But I don't think I've ever encountered that in real life. (Plus one can just move the complex logic into a function and call the function twice. With anonymous functions one can do it inline, and it will still be simpler than a Schwartzian Transform.)

And, of course, someone who hasn't studied sorting tricks is always going to find the straightforward sort block far easier to read.

However the sort block executes more times than mangle/extract blocks do in the Schwartzian Transform or the Guttman-Rosler Transform. So the more work you move from the sort blocks to mangle/extract, the more time you'll save. The GRT is faster than the Schwartzian Transform because it uses a simpler data structure (a string), and so the sort block can be made even faster (in fact it is the default string compare).

People think that this is cool because they are surprised that this change can have such big performance implications. But it is an optimization, and the code you get is more complex (at least in Perl).

Update: hv noted that I'd written GST instead of GRT. Fixed.

Comment on Re^5: top ten things every Perl hacker should know

Replies are listed 'Best First'.
Re^6: top ten things every Perl hacker should know by johngg (Canon) on Mar 17, 2006 at 19:42 UTC
I'll take one of my ST sorts and translate it to "straightforward sort block" style and see what happens. I'll give you an update when it's done. Cheers, JohnGG	[reply]
Re^7: top ten things every Perl hacker should know by tilly (Archbishop) on Mar 17, 2006 at 22:44 UTC
You can look at Re^5: top ten things every Perl hacker should know for an example of the same sort written as a Schwartzian Transform and as a regular sort. (I intentionally chose one that was similar to the example that gave the Schwartzian transform its name.)	[reply]
Re^8: top ten things every Perl hacker should know by johngg (Canon) on Mar 18, 2006 at 12:07 UTC
No, you are right and I am wrong! If the data you need for sorting is there in front of you with no need for any "mangling", the "straightforward sort block" is simpler to write than the Schwartzian Transform. I was confusing the tranforming done as a performance measure with other transformations to the data that are necessary because it is not yet in a sortable form. If you do need to change the data in some way before the sort can take place then the Schwartzian Transform starts to gain in the straighforwardness factor, I think. The sort block method gains in readability because you can use meaningful variable names rather than $_; the Schwartzian Transform gains because the flow seems to me to be more obvious, albeit up the page which is counter-intuitive if you are used to piping commands in the shell. I tend to comment what is going on in the transform for those that are not familiar although I haven't below. The examples I give here sort the output of df -k by controller, target, device and slice. First, the "straightforward sort block". `open DF, "df -k \|" or die "$!\n"; print scalar <DF>; while(<DF>) { next unless m{^/dev/dsk/c(\d+)t(\d+)d(\d+)s(\d+)}; push @lines, [$_, $1, $2, $3, $4]; } close DF; @sorted = sort { $a->[1] <=> $b->[1] \|\| $a->[2] <=> $b->[2] \|\| $a->[3] <=> $b->[3] \|\| $a->[4] <=> $b->[4]} @lines; print $_->[0] for @sorted;` [download] Now the Schwartzian Transform. `open DF, "df -k \|" or die "$!\n"; print scalar <DF>; print map {$_->[0]} sort { $a->[1] <=> $b->[1] \|\| $a->[2] <=> $b->[2] \|\| $a->[3] <=> $b->[3] \|\| $a->[4] <=> $b->[4] } grep {defined $_->[1]} map {[ $_, m{^/dev/dsk/c(\d+)t(\d+)d(\d+)s(\d+)} ]} <DF>; close DF;` [download] I think there are arguments in favour of either method. I have got used to Schwartian Transforms so I automatically reach for those when sorting problems come up. Cheers, JohnGG	[reply] [d/l] [select]