Improving sorts using ST

sweetblood has asked for the wisdom of the Perl Monks concerning the following question:

I've on several occasions read about the Schwartzian Transform and I've tried it in small apps with simple sorts, with limited success. The advantage in it, I understand is being able to do more complex sorting as well as improved performance. But to apply ST to more complicated sorts has escaped me. I use a sub I wrote some time ago that I can pass an array ref(containing delimited records), a delimiter, the field I wish to sort on and a flag toggling whether or not I want unique output. In merlyn's tutorial he explains the inefficiency of doing splits inside your sort routines(this make great sense), I rely on splits heavily in this sub. I have included my sub for reference and wonder if I might get some suggestions about improving it’s performance using ST and also adding the ability to sort on more than one field.

Here’s my code:

sub sort_array_records {
    # sort_array_records takes a reference to an array such as \@myarr
+ay as
    # the 1st arg followed by a delimeter, a key (numeric index) and a
    # boolean uniq flag if a unique array is expected in return.
    # sort_array_records returns the array in list context. If the key
+ has
    # an "n" appended to it the sort is done numerically. An example o
+f
    # sort_array_records is:
    # @uniq_users = sort_array_records (\@passswd, ":", 0, 1);
    my ($array_2_sort_ref, $delim, $key, $uniq) = @_;
    my $numeric = 0;
    my @sorted_array = ();
    if ($key =~ /\d*n/) {
        $numeric = 1;
        $key =~ s/(^\d*)n/$1/;
    }
    if ($numeric) {
        @sorted_array = sort { (split /$delim/, $a)[$key] <=> (split /
+$delim/, $b)[$key] } @$array_2_sort_ref;
    } else {
        @sorted_array = sort { (split /$delim/, $a)[$key] cmp (split /
+$delim/, $b)[$key] } @$array_2_sort_ref;
    }
    if ($uniq) {
        my $prev = lc($sorted_array[$key]);
        my @uniq_elems = grep( lc((split /$delim/, $_)[$key]) ne $prev
+ && ($prev
 = lc((split /$delim/, $_)[$key]), 1), @sorted_array);
        return @uniq_elems;
    } else {
       return @sorted_array;
    }
}
[download]

Incidentally, I did not write this sub from scratch but more expanded on ideas I found in the Perl Cookbook ISBN:ISBN 1565922433 and here at Perlmonks.

TIA

Sweetblood

Comment on Improving sorts using ST Download Code

Replies are listed 'Best First'.
Re: Improving sorts using ST by Abigail-II (Bishop) on May 19, 2004 at 19:55 UTC
That's not really a complicated sort. Here's one way of doing it using ST: `@sorted_array = map {$_ -> [0]} sort {$a -> [1] <=> $b -> [1]} map {[$_ => (split /$delim/ => $_) [$key]]} @$array_2_ +sort_ref;` [download] For the non-numeric sort, replace `<=>` with `cmp`. You are mistaken however that ST allows more complex sorting. There's nothing ST can sort that you can't sort with a non-ST method. ST can be more efficient though. However, a GRT often beats an ST when it comes to efficiency. Abigail	[reply] [d/l]
Re: Re: Improving sorts using ST by drewbie (Chaplain) on May 20, 2004 at 17:45 UTC
Uri Guttman has written Sort::Maker (not yet on CPAN) which will help build the different types of sorts for those who want to build them on the fly. He just gave a talk about it at the boston.pm meeting Tuesday and IIRC he's going to be talking about it at YAPC::NA this year. The talk (and some of the module POD also) focuses a bit on when and why you would use the different types of sorts.	[reply]
Re: Re: Re: Improving sorts using ST by sweetblood (Prior) on May 20, 2004 at 19:33 UTC
This should be very interesting once available. Thanks! Sweetblood	[reply]
Re: Re: Improving sorts using ST by sweetblood (Prior) on May 20, 2004 at 14:57 UTC
Hmmm, that gives me more to think about.(I like that) Still, I'm having trouble understanding how to do this sort on mutltiple fields. Thanks agian! Sweetblood	[reply]