I figured that I would just post this, since both dragonchild and I made the same mistake while executing a Schwartzian Transform. So we've got my mistake, dragonchild's mistake, and merlyn's correction to my mistake.
The issue that came up is that we were using the Schwartzian Transform in such a way that it isn't really a Schwartzian Transform anymore because of the performance drop we caused. The largest benefit of using this sorting technique in the first place is that it reduces the number of times we need to execute an expensive piece of code in order to complete our sorting. I do not have any experience with slightly advanced math, so I can't provide the figures to show how many times we save executing the expensive function for a list of n items, but even I know that we are saving a lot of work.
So just as a reminder to anyone who may not be quite up to date as to how to use the Schwartzian Transform to your advantage, here's a 'good and bad' way of doing it. This example actually has 2 places for improvement.
# we wish to sort the list of medalists, first by # medal placement, then alphabetically by name # we need a map in order to sort out the medal order my %map = ( gold => 1, silver => 2, bronze => 3 ); =for bad example # bad way - too much work being done in the sort! my @sorted = map { $_->[0] } sort { $map{$a->[2]} <=> $map{$b->[2]} or # very very lc($a->[1]) cmp lc($b->[1]) # evil!! } map { /^(([^:]+):\s+(\w+))$/ ? [$1,$2,$3] : () } <DATA>; =cut # good way - we only check the map hash once per item # and we only lc() the names once per item my @sorted = map { $_->[0] } sort { $a->[2] <=> $b->[2] or $a->[1] cmp $b->[1] } map { /^(([^:]+):\s+(\w+))$/ ? [$1, lc($2), $map{$3}] : () # very very good!! } <DATA>; print join("\n", @sorted); __DATA__ Sarah: silver Gloria: bronze Tracy: gold Shawn: silver Brad: gold Jack: bronze
So, the simple lesson here? Within a Schwartzian Transform, the only work being done within the sort() block is just that: the sort. Any data lookups or expensive checks on the data should be done within the first map that is executed, thus ensuring a much quicker sort on large data sets. It's quite the simple concept really, I can't help but wonder why the heck I placed the hash lookup within the sort! :)
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Malformed Schwartzian Transforms
by tilly (Archbishop) on Apr 09, 2004 at 22:49 UTC | |
by ihb (Deacon) on Apr 10, 2004 at 18:05 UTC | |
by flyingmoose (Priest) on Apr 12, 2004 at 16:43 UTC |