 
PerlMonks  

If I remember correctly, the name was popularized by Joseph Hall, who co-wrote Effective Perl Programming with Randal Schwartz (the coinage itself is usually credited to Tom Christiansen). As has been mentioned here, the primary reason for the transform is efficiency. Computing each sort term once, up front, and using Perl's list-processing features to avoid assignments to temporary variables turns out to yield substantial savings.

Here's an example from something I just worked on. I needed to write some code to group the sale prices of recently sold homes into brackets ($100k-199k, $200k-299k, etc.) and then sort them. To group the prices, instead of using a range or if ($x->{SP} >= 100 and $x->{SP} < 200) {...} elsif ($x->{SP} >= 200 and $x->{SP} < 300) {...} etc., I just computed int($home->{SP}/100)*100. Now 128 and 192 become 100, 202 and 246 become 200, etc. There was more to it, but this is an example.
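As a quick illustration of the bracketing arithmetic, here is a minimal sketch. The price list is made up for the example; prices are assumed to be in thousands of dollars, as in the post.

```perl
use strict;
use warnings;

# Hypothetical sale prices, in thousands of dollars.
my @prices = (128, 192, 202, 246, 305);

# Truncate each price down to the start of its $100k bracket.
my @brackets = map { int($_ / 100) * 100 } @prices;

print "@brackets\n";   # prints: 100 100 200 200 300
```

Note that int() truncates toward zero, so this only groups correctly for non-negative prices.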

Now, on the face of it, the sorting would look something like

@sorted = sort { int($a->{SP}/100)*100 <=> int($b->{SP}/100)*100 } @unsorted;
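The problem is that when you sort N items, the number of comparisons made is on the order of N log N (Perl's sort has been a mergesort since 5.8; the older quicksort could degrade to N**2 in the worst case). Thus, sorting 1000 items takes roughly 10,000 comparisons, and since each comparison computes the sort term for both $a and $b, that means roughly 20,000 instances of dereferencing, doing some math, lopping off the decimals, etc., when only 1000 are actually needed. With more complicated sort terms, it can get quite hairy.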
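To see how often the sort term gets recomputed, one can wrap it in a counting function. This is only a sketch: the bracket() helper, the $calls counter, and the random test data are all invented for the measurement, and the exact count will vary from run to run and between Perl versions.

```perl
use strict;
use warnings;

# Hypothetical data: 1000 home records with random prices (in thousands).
my @unsorted = map { +{ SP => 100 + int(rand(900)) } } 1 .. 1000;

# Count every computation of the sort term.
my $calls = 0;
sub bracket {
    my ($home) = @_;
    $calls++;
    return int($home->{SP} / 100) * 100;
}

# The naive sort: the term is recomputed for both sides of every comparison.
my @sorted = sort { bracket($a) <=> bracket($b) } @unsorted;

printf "%d items, %d sort-term computations\n", scalar(@unsorted), $calls;
```

Whatever the exact number, it is well above the 1000 computations that are strictly necessary, since every element takes part in many comparisons.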

So for efficiency, the ST creates a list of two-element anonymous arrays of the form

( [$sort_term, $ref_to_orig_data], [$sort_term, $ref_to_orig_data], etc)

Then you just sort the whole thing once, comparing the precomputed sort terms. The trick is to do it without temporary variables. This is where map can be so useful. Read it from the bottom up.

@sorted_refs =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ int($_->{SP}/100)*100, $_ ] }
    @unsorted_refs;
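For anyone who finds the chained form hard to read, here is the same transform spelled out one stage at a time. It is a sketch with invented sample records (the addr field and the @decorated and @ordered temporaries are not from the original code); the chained version simply removes the temporaries.

```perl
use strict;
use warnings;

# Hypothetical records; SP is the sale price in thousands of dollars.
my @unsorted_refs = (
    { addr => '12 Elm St',   SP => 246 },
    { addr => '9 Oak Ave',   SP => 128 },
    { addr => '3 Pine Rd',   SP => 305 },
    { addr => '40 Birch Ln', SP => 192 },
    { addr => '7 Ash Ct',    SP => 202 },
);

# Stage 1 (the bottom map): pair each record with its precomputed sort term.
my @decorated = map { [ int($_->{SP} / 100) * 100, $_ ] } @unsorted_refs;

# Stage 2 (the sort): compare only the precomputed terms.
my @ordered = sort { $a->[0] <=> $b->[0] } @decorated;

# Stage 3 (the top map): discard the sort terms, keeping the original records.
my @sorted_refs = map { $_->[1] } @ordered;

print "$_->{SP}\t$_->{addr}\n" for @sorted_refs;
```

The records come out grouped by bracket (100, 100, 200, 200, 300). The order of records within a bracket is not specified by this comparison alone.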

You can also do this with additional terms, such as sorting within groupings.

@sorted_refs =
    map  { $_->[2] }
    sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] }
    map  { [ int($_->{SP}/100)*100, DateToTmStr($_->{SaleDate}), $_ ] }
    @unsorted_refs;
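Here is a runnable sketch of the two-key version. The real DateToTmStr is not shown in this post, so the one below is only a stand-in: it assumes dates arrive as 'YYYY-MM-DD' strings and turns them into the number YYYYMMDD, which compares correctly with <=>. The sample records are also invented.

```perl
use strict;
use warnings;

# Stand-in for DateToTmStr (the real definition is not shown in the post).
# Assumption: 'YYYY-MM-DD' in, numeric YYYYMMDD out.
sub DateToTmStr {
    my ($date) = @_;
    (my $digits = $date) =~ tr/-//d;   # strip the dashes
    return $digits;
}

# Hypothetical records; SP is the sale price in thousands of dollars.
my @unsorted_refs = (
    { SP => 246, SaleDate => '2024-03-15' },
    { SP => 128, SaleDate => '2024-02-01' },
    { SP => 202, SaleDate => '2024-01-20' },
    { SP => 192, SaleDate => '2024-01-05' },
);

# Sort by price bracket first, then by sale date within each bracket.
my @sorted_refs =
    map  { $_->[2] }
    sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] }
    map  { [ int($_->{SP} / 100) * 100, DateToTmStr($_->{SaleDate}), $_ ] }
    @unsorted_refs;

print "$_->{SP}\t$_->{SaleDate}\n" for @sorted_refs;
```

With this data the 100 bracket comes first (192, then 128, by date), followed by the 200 bracket (202, then 246).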

I read an article a few years ago which takes this concept further and recommends, for certain data, concatenating the sort term, a connector of some kind, and the original data into a single string. Plain strings can be sorted with the default comparison, with no sort block and no dereferencing, so you can save quite a bit of time; though this only works if you have data that can be serialized without adding even more work than you save.
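This packed-string approach is often called the Guttman-Rosler Transform. A minimal sketch follows, assuming the original data are flat strings and the sort term is a non-negative integer that fits in five digits. The record format, the five-digit width, and the NUL connector are choices made for this example, not taken from the article.

```perl
use strict;
use warnings;

# Hypothetical flat records: "address,price in thousands".
my @unsorted = (
    '12 Elm St,246',
    '9 Oak Ave,128',
    '3 Pine Rd,305',
    '40 Birch Ln,192',
);

my @sorted =
    map  { substr($_, 6) }    # strip the 5-digit key and 1-byte connector
    sort                      # default string sort: no comparison block
    map  {
        my ($price) = /,(\d+)\z/;
        # Zero-padded key so that string order matches numeric order,
        # then a NUL byte as connector, then the original record.
        sprintf("%05d\0%s", int($price / 100) * 100, $_);
    }
    @unsorted;

print "$_\n" for @sorted;
```

Because the whole string is compared, records with equal keys end up ordered by their own text, so here '40 Birch Ln,192' sorts ahead of '9 Oak Ave,128' within the 100 bracket. The fixed-width padding is what makes this work; a key wider than five digits would break the ordering.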


In reply to Re: What is "Schwartzian Transform" by furry_marmot
in thread What is "Schwarzian Transform" (aka Schwartzian) by GrandFather
