The code would need more work before it could go into List::Util
Agreed, but I did have a go at making topN() accept a comparator function, but the cost off calling back into Perl code for each comparison exacts a heavy performance penalty--to the point where it looses all benefit. Though it is possible that my fledgling XS/internals skills bore some of that cost.
The only way I see to avoid that, would be to have four separate routines, (say) highNn(); highNs(); lowNn(); lowNs();.
I think this is the route taken by sort, albeit that the "naming convention" (for backwards compatibiity?) is sort{$a<=>$b}(); thru sort{$bcmp$a}();
Since the basic algorithm needs only a size-N array, there ought to be some way to wrap this as a package that takes an arbitrary input set of unlimited size,...
Personally, I'm not very keen on that type of pseudo-OO wrapper around what is essentially a functional algorithm. Would the wrapper code accumulate a copy of everything passed or discard values that didn't make the cut?
If the former, unnecessary duplication of storage results. If the latter, I think I would prefer not to do a function call to reject individual values that don't make the cut. I'd code the accumlator scenario something like:
my @topN = topN( $n, \@data ); @topN = topN( $n, [ @topN, grep{ chomp; $_ > $topN[ -1 ] } <> ] );
(BTW. I really wish that chomp returned the chomped value. It would make some constructs so much easier.)
or maybe
my @topN( $n, \@data ); my @batch; while( not eof ARGV ) { for( my $i=0; $_ = <> and $i < $BATCHSIZE; $i++ ) { chomp; push @batch, $_; } @topN( $n, [ @topN, @batch ] ); }
Basically, I prefer to retain programmer choice in stuff like this--but there would be nothing to stop someone adding a wrapper around it, though four (type/direction) versions would probably be needed again.
In reply to Re^5: Better mousetrap (getting top N values from list X)
by BrowserUk
in thread Better mousetrap (getting top N values from list X)
by Limbic~Region
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |