I figured that I would just post this, since both dragonchild and I made the same mistake while executing a Schwartzian Transform. So we've got my mistake, dragonchild's mistake, and merlyn's correction to my mistake.

The issue that came up is that we were using the Schwartzian Transform in such a way that it isn't really a Schwartzian Transform anymore because of the performance drop we caused. The largest benefit of using this sorting technique in the first place is that it reduces the number of times we need to execute an expensive piece of code in order to complete our sorting. I do not have any experience with slightly advanced math, so I can't provide the figures to show how many times we save executing the expensive function for a list of n items, but even I know that we are saving a lot of work.

So just as a reminder to anyone who may not be quite up to date as to how to use the Schwartzian Transform to your advantage, here's a 'good and bad' way of doing it. This example actually has 2 places for improvement.

# we wish to sort the list of medalists, first by # medal placement, then alphabetically by name # we need a map in order to sort out the medal order my %map = ( gold => 1, silver => 2, bronze => 3 ); =for bad example # bad way - too much work being done in the sort! my @sorted = map { $_->[0] } sort { $map{$a->[2]} <=> $map{$b->[2]} or # very very lc($a->[1]) cmp lc($b->[1]) # evil!! } map { /^(([^:]+):\s+(\w+))$/ ? [$1,$2,$3] : () } <DATA>; =cut # good way - we only check the map hash once per item # and we only lc() the names once per item my @sorted = map { $_->[0] } sort { $a->[2] <=> $b->[2] or $a->[1] cmp $b->[1] } map { /^(([^:]+):\s+(\w+))$/ ? [$1, lc($2), $map{$3}] : () # very very good!! } <DATA>; print join("\n", @sorted); __DATA__ Sarah: silver Gloria: bronze Tracy: gold Shawn: silver Brad: gold Jack: bronze

So, the simple lesson here? Within a Schwartzian Transform, the only work being done within the sort() block is just that: the sort. Any data lookups or expensive checks on the data should be done within the first map that is executed, thus ensuring a much quicker sort on large data sets. It's quite the simple concept really, I can't help but wonder why the heck I placed the hash lookup within the sort! :)


In reply to Malformed Schwartzian Transforms by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.