I figured that I would just post this, since both dragonchild and I made the same mistake while executing a Schwartzian Transform. So we've got my mistake, dragonchild's mistake, and merlyn's correction to my mistake.

The issue that came up is that we were using the Schwartzian Transform in such a way that it isn't really a Schwartzian Transform anymore because of the performance drop we caused. The largest benefit of using this sorting technique in the first place is that it reduces the number of times we need to execute an expensive piece of code in order to complete our sorting. I do not have any experience with slightly advanced math, so I can't provide the figures to show how many times we save executing the expensive function for a list of n items, but even I know that we are saving a lot of work.

So just as a reminder to anyone who may not be quite up to date as to how to use the Schwartzian Transform to your advantage, here's a 'good and bad' way of doing it. This example actually has 2 places for improvement.

# we wish to sort the list of medalists, first by # medal placement, then alphabetically by name # we need a map in order to sort out the medal order my %map = ( gold => 1, silver => 2, bronze => 3 ); =for bad example # bad way - too much work being done in the sort! my @sorted = map { $_->[0] } sort { $map{$a->[2]} <=> $map{$b->[2]} or # very very lc($a->[1]) cmp lc($b->[1]) # evil!! } map { /^(([^:]+):\s+(\w+))$/ ? [$1,$2,$3] : () } <DATA>; =cut # good way - we only check the map hash once per item # and we only lc() the names once per item my @sorted = map { $_->[0] } sort { $a->[2] <=> $b->[2] or $a->[1] cmp $b->[1] } map { /^(([^:]+):\s+(\w+))$/ ? [$1, lc($2), $map{$3}] : () # very very good!! } <DATA>; print join("\n", @sorted); __DATA__ Sarah: silver Gloria: bronze Tracy: gold Shawn: silver Brad: gold Jack: bronze

So, the simple lesson here? Within a Schwartzian Transform, the only work being done within the sort() block is just that: the sort. Any data lookups or expensive checks on the data should be done within the first map that is executed, thus ensuring a much quicker sort on large data sets. It's quite the simple concept really, I can't help but wonder why the heck I placed the hash lookup within the sort! :)

Replies are listed 'Best First'.
Re: Malformed Schwartzian Transforms
by tilly (Archbishop) on Apr 09, 2004 at 22:49 UTC
    And here is a lesson that was lost on virtually everyone in the thread.

    The Schwartzian Transform is an optimization. Starting off writing a sort by by saying, "I'm going to write a Schwartzian Transform because it is slow" is premature optimization. Don't micro-optimize a step that is probably miniscule in the overall scheme of things.

    In the specific example that you give, the transform is justifiable not by performance, but by the fact that you avoid having to write parallel pre-processing logic for $a and $b. At which point there is a rationale for moving all of the complexity into the first map (put all of the data processing logic in one place), but it is a minor win at best.

      Indeed. I seldom need to care about optimization yet I often use Schwartzian Transforms as an OAOO technique.

      ihb

        OAOO

        <TPB> You keep using that word. I do not think it means what you think it means. </TPB>