Re: sort direction
by Tanktalus (Canon) on Oct 07, 2005 at 15:10 UTC
|
Interestingly, no one suggested using different sort functions.
sub mysort_asc {
$hash{$a}{$orderby} <=> $hash{$b}{$orderby}
||
$hash{$a}{$orderby} cmp $hash{$b}{$orderby}
}
sub mysort_desc {
$hash{$b}{$orderby} <=> $hash{$a}{$orderby}
||
$hash{$b}{$orderby} cmp $hash{$a}{$orderby}
}
my $sort_func = $direction eq 'DESC' ? \&mysort_desc : \&mysort_asc;
foreach my $id (sort { &$sort_func } keys %hash)
{
#...
}
(untested) | [reply] [d/l] |
|
|
How about dispatching the direction to get the appropriate sort function?
my %sorts = (
DESC => sub {
$hash{$b}{$orderby} <=> $hash{$a}{$orderby}
or
$hash{$b}{$orderby} cmp $hash{$a}{$orderby}
},
ASC => sub {
$hash{$a}{$orderby} <=> $hash{$b}{$orderby}
or
$hash{$a}{$orderby} cmp $hash{$b}{$orderby}
}
);
my $sort_func = $sorts{$direction};
foreach my $id (sort { &$sort_func } keys %hash) {
Update: do the hash lookup only once, not for every comparison.
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
Re: sort direction
by Perl Mouse (Chaplain) on Oct 07, 2005 at 13:01 UTC
|
my @sorted_keys =
sort {$hash{$a}{$orderby} <=> $hash{$b}{$orderby} ||
$hash{$a}{$orderby} cmp $hash{$b}{$orderby}
keys %hash;
@sorted_keys = reverse @sorted_keys if $direction eq "DESC";
foreach my $id (@sorted_keys) {
print "$id - $hash{$id}{$orderby}\n";
}
Although I vastly prefer to keep a mapping of which keys are numbers and which aren't, and don't do two tests for each non-numeric pair of keys you encounter. (Not to mention that your non-numeric entries happen to have strings that start with a number, it's going to treat them as numbers). And if I didn't I'd use a GRT to avoid the sort block. But I disgress.
| [reply] [d/l] |
|
|
The problem I have with your solution isn't so much the "cleanness", as well its inefficiency. You will be checking for direction for every comparison made by the sort - instead of checking it once.
In either case, you still have an O(n log n) sort. If you are really concerned about the difference in efficiency here, you might be better off programming in C or assembly. With perl, it's not even so easy to determine the efficiency with that level of granularity. For instance, how does your reverse() compare to the additional branches in the OP's code?
It seemed the OP was hoping to make his code cleaner and therefore easier to read and maintain. In terms of efficiency, he could be saving minutes to hours of programmer time over the lifetime of this program compared to the mere seconds of machine time your optimization may save. And when you compare costs, the difference is likely to be even more significant because developer time generally costs a lot more than machine time.
None of this is to say the solution you suggested is a bad one. It isn't. Regardless of efficiency questions, it is clearer. If he's going to use this same sort more often than once, though, sticking it in a sub might be a good idea (in which case, of course, it should be given a better name than the OP's original mysort().) Two separate subs for ascending and descending might be a good idea in that case too.
-sauoq
"My two cents aren't worth a dime.";
| [reply] [d/l] [select] |
|
|
And when you compare costs, the difference is likely to be even more significant because developer time generally costs a lot more than machine time.
That's a red herring. Machine time is utterly unimportant. But what is important is the end user. Would you accept it that the movie you're watching on your PC freezes because developer time costs more than machine time? In general, it's very hard to say anything about developer time vs. running time. Many programs aren't designed to be run once, they will be run thousands, or millions of times.
In this specific case, lifting the descending/ascending switch outside of the sort (be it by using reverse, or two comparison functions) doesn't cost anything. It's the same amount of code, making the same decisions. I've a hard time imagining a coder that would take hours more time to do the same amount of things.
A good coder will always program with efficiency in mind - doing thinks like this (lifting code that will yield the same result out of a loop) should be automatic. And don't come with the mantra premature optimization is the root of all evil, because we're not sacrificing anything. The code doesn't become less flexible, harder to read or harder to maintain. It doesn't trade memory or some other resource for speed either.
| [reply] |
|
|
Re: sort direction
by chester (Hermit) on Oct 07, 2005 at 14:00 UTC
|
You might look at Sort::Key and friends, which can sort numerically or string...ically, and also in reverse. I agree with [id://Perl Mouse] about the dangers of mixing numeric and string sort, unless you're really sure of your data. | [reply] |
Re: sort direction
by Roy Johnson (Monsignor) on Oct 07, 2005 at 13:40 UTC
|
I think you're looking for something like:
sub mysort {
my ($direction) = @_;
my ($one, $two) = @hash{($direction eq 'DESC') ? ($b, $a) : ($
+a, $b)};
$one->{$orderby} <=> $two->{$orderby}
or
$one->{$orderby} cmp $two->{$orderby}
} # end-sub
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
Re: sort direction
by davidrw (Prior) on Oct 07, 2005 at 14:06 UTC
|
I sort of combined Perl Mouse's solution with OP's original attempt where a sort was directly used but eliminated the wrapper sub:
my %hash = ( ... );
my $orderby = "age" # or name, dob, etc
my $direction = "ASC" # or DESC
foreach my $id (
map
{ $direction eq "DESC" ? reverse @$_ : @$_ }
[
sort {
$hash{$a}{$orderby} <=> $hash{$b}{$orderby}
||
$hash{$a}{$orderby} cmp $hash{$b}{$orderby}
} keys %hash
]
) {
print "$id - $hash{$id}{$orderby}\n";
} # end-foreach
__END__
# my test script:
perl -le '@x=(1,4,5,2,3,9,6,8,7); $direction="DESC"; print for map { $
+direction eq "DESC" ? reverse @$_ : @$_ } [sort @x] '
| [reply] [d/l] [select] |
Re: sort direction
by ikegami (Patriarch) on Oct 07, 2005 at 15:53 UTC
|
That sort routine doesn't work too well when sorting string fields. When sorting string fields, number are considered greater than strings. In ASCII, it's the other way around.
>perl -le "print '123' <=> 'abc'"
1
>perl -le "print '123' cmp 'abc'"
-1
It would be better if you knew if the field was numerical or a string in advance. The following makes all decision making in advance, speeding up the sorting.
my $orderby = 'age';
my $numerical = 1;
my $direction = 'ASC';
my $sorter;
if ($direction eq 'ASC') {
if ($numerical) {
$sorter = sub { $hash{$a}{$orderby} <=> $hash{$b}{$orderby} };
} else {
$sorter = sub { $hash{$a}{$orderby} cmp $hash{$b}{$orderby} };
}
} else {
if ($numerical) {
$sorter = sub { $hash{$b}{$orderby} <=> $hash{$a}{$orderby} };
} else {
$sorter = sub { $hash{$b}{$orderby} cmp $hash{$a}{$orderby} };
}
}
@sorted = sort { &$sorter } keys %hash;
| [reply] [d/l] [select] |
Re: sort direction
by Roy Johnson (Monsignor) on Oct 07, 2005 at 15:42 UTC
|
It seems to me that you might be able to do a little glob magic. And it's really more of a clever-for-fun idea than a recommended coding technique. I can't test it, so it probably needs some adjustments, but the idea should be pretty clear.
sub mysort {
$hash{$a}{$orderby} <=> $hash{$b}{$orderby}
or
$hash{$a}{$orderby} cmp $hash{$b}{$orderby}
}
{
local (*a, *b) = ($direction eq 'DESC') ? (\*b, \*a) : (\*a, \*b);
foreach my $id (sort mysort keys %hash) {
print "$id - $hash{$id}{$orderby}\n";
}
}
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
Re: sort direction
by radiantmatrix (Parson) on Oct 07, 2005 at 16:17 UTC
|
Accoring to the documentation for sort, it expects a function that returns -1, 0, or 1 for (a<b),(a==b),(a>b), respectively. So, that's all your sub needs to do.
# note passing your sub $a and $b
foreach my $id (sort { mysort($direction,$a,$b) } keys %hash) {
print "$id - $hash{$id}{$orderby}\n";
} # end-foreach
sub mysort {
my ($direction, $_a, $_b) = @_;
my $result =
$hash{$_a}{$orderby} <=> $hash{$_b}{$orderby}
|| $hash{$_a}{$orderby} cmp $hash{$_b}{$orderby};
# default to ascending, but if the dir is 'DESC', we
# return the *opposite* answer
return ($direction eq 'DESC') ? -($result) : $result;
}
Untested, of course, but the idea should be basically sound. If I understand what you're trying to accomplish, though, you might take a different approach.
# we pass the sub the actual values to compare
foreach my $id (
sort { mysort($direction,$hash{$a}{$orderby},$hash{$b}{$orderby}) }
+keys %hash
) {
print "$id - $hash{$id}{$orderby}\n";
} # end-foreach
sub mysort {
my ($direction, $_a, $_b) = @_;
# generalized, simplified comparison
my $result = $_a <=> $_b || $_a cmp $_b;
# default to ascending, but if the dir is 'DESC', we
# return the *opposite* answer
return ($direction eq 'DESC') ? -($result) : $result;
}
| [reply] [d/l] [select] |
|
|
Accoring to the documentation for sort, it expects a function that returns -1, 0, or 1 for (a<b),(a==b),(a>b), respectively. So, that's all your sub needs to do.
Actually, according to your link:
SUBNAME ... gives the name of a subroutine that returns an integer less than, equal to, or greater than 0
which is rather less stringent than requiring -1, 0 or 1.
Roger
| [reply] |
|
|
You are correct, if pedantic. :-) Just out of curiosity, under what circumstances would it be easier or faster to return something other than -1,0,1 ?
| [reply] |
|
|
|
|
|
Re: sort direction
by Roger_B (Scribe) on Oct 07, 2005 at 16:20 UTC
|
sub mysort {
my ($direction) = @_;
( $hash{$a}{$orderby} <=> $hash{$b}{$orderby}
||
$hash{$a}{$orderby} cmp $hash{$b}{$orderby})
*
($direction eq "DESC" ? -1 : 1);
}
This minimises duplication and cleans up the source code.
But it still means evaluating the direction each time. All those wasted cycles!
How about a closure?
my $orderby = "age"; # or name, dob, etc
my $direction = "ASC"; # or DESC
my $sortfunc = getsort($direction);
foreach my $id (sort $sortfunc keys %hash) {
print "$id - $hash{$id}{$orderby}\n";
} # end-foreach
sub getsort {
my ($direction) = @_;
my $sense = $direction eq "DESC" ? -1 : 1;
return sub {
( $hash{$a}{$orderby} <=> $hash{$b}{$orderby}
||
$hash{$a}{$orderby} cmp $hash{$b}{$orderby})
*
$sense;
}
}
This keeps the code clean and keeps the operations to be done in each comparison close to the minimum, without using reverse.
Roger | [reply] [d/l] [select] |
|
|
This keeps the code clean and keeps the operations to be done in each comparison close to the minimum, without using reverse.
But you're still doing a multiplication for on each comparison. Why avoid using reverse?
| [reply] |
Re: sort direction
by ph713 (Pilgrim) on Oct 07, 2005 at 13:09 UTC
|
#!/usr/bin/perl -w
use Carp;
my %people = (
Bob => { age => 60, town => 'Harrisburg' },
Joe => { age => 55, town => 'Calgary' },
Sue => { age => 40, town => 'Houston' },
Jim => { age => 42, town => 'Dresden' },
Ann => { age => 47, town => 'Los Angeles' },
);
my %data_types = (
age => 'numeric',
town => 'text',
# etc...
);
sub sort_people_by {
my ($people,$sortby,$dir) = @_;
my $dtype = $data_types{$sortby};
if(!$dtype) {
croak "No data type defined for $sortby";
}
elsif($dtype eq 'numeric') {
if($dir eq 'ASC') {
return sort { $people->{$a}->{$sortby}
<=> $people->{$b}->{$sortby} }
keys %$people;
}
else {
return sort { $people->{$b}->{$sortby}
<=> $people->{$a}->{$sortby} }
keys %$people;
}
}
elsif($dtype eq 'text') {
if($dir eq 'ASC') {
return sort { $people->{$a}->{$sortby}
cmp $people->{$b}->{$sortby} }
keys %$people;
}
else {
return sort { $people->{$b}->{$sortby}
cmp $people->{$a}->{$sortby} }
keys %$people;
}
}
else {
croak "Data type $dtype unknown";
}
}
sub print_people_sorted_by {
my ($people,$sortby,$dir) = @_;
foreach my $person
(sort_people_by($people,$sortby,$dir)) {
print "$person - $people{$person}{$sortby}\n";
}
}
print_people_sorted_by(\%people,'age','ASC');
print_people_sorted_by(\%people,'town','DESC');
Updated style to be closer to PBP, since I need the practice anyways - old habits die hard :) | [reply] [d/l] |
Re: sort direction
by BrowserUk (Patriarch) on Oct 07, 2005 at 20:24 UTC
|
I don't think this has anything that isn't in one or other of the other posts, but avoiding two levels of dereference four times inside the sort block, should more than compensate for the simplicity of using reverse, to avoid having two of them.
I don't have much of a problem with your generic sort block. If the fields are consistantly numeric or non-numeric, it will do the right thing, and if they are not, it probably means that someone entered bad data and so long as the bad stuff is sorted to one end or the other it does much matter which.
The one time it could give you a surprise if your DOB contained stuff like "12th Jan 1901" and "12th december 2004", (ie. non-numeric strings that start with numbers), which would be grouped together by day, but otherwise be in a random ordering. Then again, sorting dates is pain unless they are in a fully numeric format (eg.YYYYMMDD) or similar. One thing that did strike me as odd was you maintaining both a DOB and AGE fields? If so, you create a consistancy problem with non-normailsed data.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
| [reply] [d/l] |
Re: sort direction
by rir (Vicar) on Oct 08, 2005 at 03:56 UTC
|
I find problems with your code that have not been commented on. Your specifically the direction part is a red herring that responders seemed to chase.
The problems with your mysort are that it is
single purpose and it only sorts
the %This_package::hash and nothing else. It is
bad that one must read mysort to know that. The function is also poorly named and is defined at the wrong level. Defined in a logical, or problem domain modelling, sense.
If you can, use the standard
style of sorting. This is to use the $a and
$b variables to indicate how you are sorting.
If you want a reversed sort you use $b $a to
show that. This is to define the sort at a lower
and more generic level.
I would suggest you use something like:
sub num_or_str {
no warnings 'numeric';
$_[0] <=> $_[1] || $_[0] cmp $_[1]
}
my @sorted
= sort num_or_str(
$people{$a}->{age},
$people{$b}->{age}
), keys %people;
Then you can reverse the order of the arguments to num_or_str to reverse the sort. I consider
that idiomatic Perl as below.
If you really need a direction flag variable, you could
keep it readable:
my @stuff = qw/ 9 -1 -3 hi 0 bye 5 10 /;
my $descending = 1;
my @ar;
if ( $descending ) {
@ar = sort { num_or_str($b, $a) } @stuff;
}
else {
@ar = sort { num_or_str($a, $b) } @stuff;
}
If you want to structure code more like your sample you
should put the function around the sort. That is what
I meant about creating the function at the wrong level.
my $sort_by = 'age';
my $descending = 1;
my $database = \%people;
# return an ordered list of copies of keys of a
# hash structured as described somewhere
# usage: sort_people( \%sort_me, $by_me, $descending )
# Ascending sort is the default.
# If $by_me is not a key that exists then ...
# Maybe an arref should be returned.
sub sort_people {
my ( $people, $by, $desc) = @_;
no warnings qw/ numeric uninitialized/;
if ( $desc) {
return sort { your_conditions( $b, $a)} keys %$people;
} else {
return sort { your_conditions( $a, $b)} keys %$people;
}
}
Then this higher level code is easier to understand:
my @sorted_keys = sort_people( \%people, $sort_on, $descending);
Be well, rir
| [reply] [d/l] [select] |
Re: sort direction
by Aristotle (Chancellor) on Oct 08, 2005 at 23:22 UTC
|
for my $id ( sort_hoh \%hash, age => 'ASC' ) {
# ...
}
sub sort_hoh {
my ( $hoh, $key, $dir ) = @_;
my @dir = $dir eq 'DESC' ? ( 1, 0 ) : ( 0, 1 );
sort {
my ( $_a, $_b ) = map $_->{ $key }, @{ $hoh }{ ( $a, $b )[ @di
+r ] };
$_a <=> $_b || $_a cmp $_b;
} keys %$hoh;
}
Makeshifts last the longest. | [reply] [d/l] |
Re: sort direction
by ysth (Canon) on Oct 09, 2005 at 19:31 UTC
|
It's worth mentioning that reversing the result of the sort isn't quite the same as having the sort function return the opposite. For instance:
$ perl -e'print sort { $b <=> $a } qw/0a 0b/'
0a0b
$ perl -e'print reverse sort { $a <=> $b } qw/0a 0b/'
0b0a
| [reply] [d/l] |
|
|
P:\test>perl -e"print sort { $b <=> $a || $b cmp $a } qw/0a 0b 1a 1b/"
1b1a0b0a
P:\test>perl -e"print reverse sort { $a <=> $b || $a cmp $b } qw/0a 0b
+ 1a 1b/"
1b1a0b0a
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
| [reply] [d/l] |
|
|
I was making a general point. With the sort function as you give it, there's still
a difference, as shown by:
$ perl -wle'use overload q!""!=>sub{0},fallback=>1; $x=bless[0]; $y=bl
+ess[1];
print map @$_, sort { $b <=> $a or $b cmp $a } $x, $y;
print map @$_, reverse sort { $a <=> $b or $a cmp $b } $x, $y'
01
10
| [reply] [d/l] |
|
|
|
|
|
|
|
Considering that the OP is sorting hash keys, stability is not relevant.
| [reply] |