Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am sorting an array of file names using Perl's built-in sort function:

opendir( DIR, '.' ) || die "Can't open '.': $!";
my @filenames = readdir( DIR );
closedir( DIR );
@filenames = sort { $a cmp $b } @filenames;

I want to optimize this piece of code. Is there some way I can do this?

Code tags added by Arunbear

Replies are listed 'Best First'.
Re: how to do effective sort in perl
by Fletch (Bishop) on May 03, 2006 at 13:28 UTC

    Well, aside from calling my @filenames = sort readdir( DIR ) in one step (by default sort effectively uses the comparator you've shown, and combining the calls copies the results around one less time), what more do you think could be optimized? It doesn't get much more bare-bones than that. If you're not interested in every filename that readdir is going to return, you might stick a grep in between (see the sketch below), but there's not really much more to be pared down.
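
    For instance, a minimal sketch combining both suggestions; the dot-file filter in the grep is just an illustrative assumption, not something the original poster asked for:

    opendir( DIR, '.' ) || die "Can't opendir '.': $!";
    # Sort and filter in one pass; the grep drops '.', '..' and other dotfiles.
    my @filenames = sort grep { !/^\./ } readdir( DIR );
    closedir( DIR );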

Re: how to do effective sort in perl
by Polonius (Friar) on May 03, 2006 at 15:15 UTC

    If it's an efficient sort algorithm you're after, you should note that "In 5.7, the quicksort implementation was replaced with a stable mergesort algorithm whose worst-case behavior is O(NlogN)." (perldoc). According to Numerical Recipes, "For the basic task of sorting N elements, the best algorithms require on the order of several times N log2 N operations." That means that mergesort is just about as efficient as you can get.
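
    If you ever need to pick the algorithm explicitly, the sort pragma (available since Perl 5.8) lets you request the stable mergesort or the old quicksort; a small sketch, purely illustrative:

    use sort 'stable';        # insist on stable behaviour
    # use sort '_quicksort';  # or ask for the pre-5.8 quicksort (the leading
    #                         # underscore marks it as subject to change)
    my @sorted = sort @filenames;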

    Incidentally, Numerical Recipes is an astonishingly good book. Knuth may be the definitive reference, and NAG may have many algorithms fine-tuned to specific cases, but NR is exceptionally good value for money. You can even download free source code in FORTRAN or C. If you're new to Perl, I'd see the lack of a Perl version as a good thing: I think it's a useful exercise in learning a language to try translating into it.

    Update: Of course, all of this is overkill for the number of items likely to be found in any sensible directory!

    Polonius
      Of course, all of this is overkill for the number of items likely to be found in any sensible directory!

      heh. I've actually had to scold some of my co-workers, telling them they really should not be putting 300,000 files in a single directory, and there are easy enough ways to avoid this... Even 50,000 seems like too much to me (but then, I learned programming on a pdp 11-10, and my sense of scale may be a bit old-fashioned).
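
      One of those easy ways, sketched here as an illustration (the two-level bucket layout and the md5_hex choice are just assumptions, not anything from this thread):

      use Digest::MD5 qw(md5_hex);
      use File::Path  qw(mkpath);
      use File::Copy  qw(move);

      # Spread files across subdirectories keyed on the first two hex digits
      # of a digest of the name, so that no single directory grows too large.
      sub bucketed_path {
          my ($file) = @_;
          my $bucket = substr( md5_hex($file), 0, 2 );
          mkpath($bucket) unless -d $bucket;
          return "$bucket/$file";
      }

      move( $_, bucketed_path($_) ) for @filenames;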

        heh. I've actually had to scold some of my co-workers, telling them they really should not be putting 300,000 files in a single directory, and there are easy enough ways to avoid this...

        Anything more than a screenful seems excessive to me! But then, I grew up on a Data General S-250, IIRC (Think PDP 11-70 - that sorta era, but less successful.)

        Polonius

        Been a while since I have referred anyone to this, but it is a pretty good introductory discussion <shamelessplug>(and response)</shamelessplug> on why some directory structures slow down after a certain number of documents are included. Not necessarily useful for those Old System Administrators¹.

        1 - OSA's never die, they just put down roots.

        --MidLifeXis

Re: how to do effective sort in perl
by BrowserUk (Patriarch) on May 03, 2006 at 14:51 UTC

    On my OS I'd do

    my @files = glob '*';

    as glob always produces a sorted list on my system. I'm not sure whether that is true on other systems.
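
    If you want that sorting behaviour to be explicit (or switched off), the core File::Glob module exposes it; a small sketch, treating the details as illustrative rather than definitive:

    use File::Glob qw(bsd_glob GLOB_NOSORT);

    my @sorted   = bsd_glob('*');                # sorted, like plain glob
    my @unsorted = bsd_glob('*', GLOB_NOSORT);   # skip the sort entirely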


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      On some OSs that doesn't include hidden files:

      $ perl -le 'opendir my $fd, "."; my @files = readdir $fd; closedir $fd; print scalar @files'
      48
      $ perl -le 'my @files = <*>; print scalar @files'
      7
      $ _

      But including them is not difficult:

      $ perl -le 'my @files = (<*>,<.*>); print scalar @files'
      48
      $ _

      It seems that readdir is faster, though:

      use Benchmark qw(:all);
      cmpthese(20000, {
          readdir => sub { opendir my $fd, "."; my @files = readdir $fd; closedir $fd; },
          glob    => sub { my @files = (<*>,<.*>); },
      });
      __END__
                   Rate    glob readdir
      glob       3170/s      --    -75%
      readdir   12739/s    302%      --

      Update: Of course glob is slower, since on this computer it also sorts the results.
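
      For a fairer comparison with the original poster's task, the readdir case should sort too; a sketch of that variation (the iteration count is just carried over from above, and the numbers will of course differ per machine):

      use Benchmark qw(:all);
      cmpthese(20000, {
          readdir_sorted => sub {
              opendir my $fd, ".";
              my @files = sort readdir $fd;   # add the sort the OP wants
              closedir $fd;
          },
          glob => sub { my @files = (<*>, <.*>); },   # glob sorts internally
      });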

      --
      David Serrano

Re: how to do effective sort in perl
by blazar (Canon) on May 03, 2006 at 13:37 UTC

    The { $a cmp $b } block is already optimized away: sort with no explicit comparison does exactly the same as what you've shown, so just remove the block (as in the line below). Oh, and BTW: you probably already knew, but "premature optimization is the root of all evil".
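
    In other words, the last line of your snippet can simply become:

    @filenames = sort @filenames;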

Re: how to do effective sort in perl
by duff (Parson) on May 03, 2006 at 14:44 UTC

    You didn't say what you wanted to optimize, so here's another take on it:

    my @filenames = do { opendir my $d, "." or die; sort readdir $d };
    Note that this only works on modern perls (the lexical directory handle used here isn't available before 5.6).
Re: how to do effective sort in perl
by QM (Parson) on May 03, 2006 at 19:35 UTC
    As others have alluded, there are several paths to optimization.

    Do you want it to run faster?

    Do you want to use less code?

    Do you want it to be more readable, maintainable, portable, or more robust?

    glob or similar will be less code. What you've given (other than the lack of code tags) is almost as good as you can do for balancing all of the criteria above.

    Incidentally, IIRC Perl has builtin optimizations for the following sort subs:

    { $a cmp $b }    # default
    { $b cmp $a }
    { $a <=> $b }
    { $b <=> $a }
    These are the most common; perl recognizes these blocks, so using any one of them should be indistinguishable (speed-wise) from the default no-sort-sub case.
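
    A quick way to convince yourself, reusing the Benchmark idiom from above (the iteration target and the sample list are arbitrary choices for illustration):

    use Benchmark qw(:all);
    my @names = map { "file$_" } 1 .. 500;
    cmpthese(-2, {
        plain   => sub { my @s = sort @names },
        block   => sub { my @s = sort { $a cmp $b } @names },   # recognized and optimized
        reverse => sub { my @s = sort { $b cmp $a } @names },   # also recognized
    });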

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re: how to do effective sort in perl
by smokemachine (Hermit) on May 03, 2006 at 20:59 UTC
    perl -e '@filenames = sort <*>'