in reply to In-place sort with order assignment
I don't see a really Perlish way to do this, so my approach would be to sort the data in passes, first taking the bottom n elements, then the next n elements, and so on. Each pass can be done relatively efficiently with a heap (or any bounded selection), but you end up scanning the data keys(%hash) / n times. You can probably speed up the second pass onwards by caching the cutoff values at positions n, 2*n and so on, so items that are "too high" get discarded earlier. I'm not sure how much time you actually save by never inserting items that would get discarded anyway. A rough sketch of the idea follows.
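This is only a sketch of the multi-pass selection, not tested against real data: %hash, the chunk size $n, and the numeric keys are all stand-ins, and instead of a proper heap module it keeps a bounded buffer that it re-sorts whenever the buffer grows too large, which has the same flavour with fewer dependencies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data set: unique numeric keys, values to be overwritten
# with each key's rank in sorted order.
my %hash = map { $_ => 1 } map { int rand 1_000_000 } 1 .. 10_000;
my $n    = 1_000;            # how many ranks to assign per pass

my $cutoff;                  # largest key already assigned a rank
my $order = 0;

while (1) {
    my @buffer;
    for my $key (keys %hash) {
        # skip keys that were handled in an earlier pass
        next if defined $cutoff && $key <= $cutoff;
        push @buffer, $key;
        # keep the buffer bounded: re-sort and keep only the n smallest
        @buffer = (sort { $a <=> $b } @buffer)[0 .. $n - 1]
            if @buffer > 2 * $n;
    }
    last unless @buffer;     # nothing left above the cutoff -> done

    @buffer = sort { $a <=> $b } @buffer;
    splice @buffer, $n if @buffer > $n;

    $hash{$_} = ++$order for @buffer;   # in-place order assignment
    $cutoff = $buffer[-1];              # remember where this pass ended
}
```

Each pass selects the n smallest keys strictly above the previous cutoff, so the assigned keys always form a prefix of the sorted order and nothing gets ranked twice.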
Alternatively, you could check whether your external sort program can handle the data size; if it can, I'd write the keys out using each, sort them externally, and read them back in with while (<$fh>) { ... }, along the lines of the sketch below.
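Again a sketch under assumptions: %hash is a placeholder, the keys are assumed numeric (hence sort -n), and a Unix-style sort(1) is assumed to be on the PATH. Streaming the keys with each avoids building the whole key list in a Perl array.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash = map { $_ => 1 } map { int rand 1_000_000 } 1 .. 10_000;

# Dump the keys to a temp file, one per line.
my $tmp = "keys.$$.tmp";
open my $out, '>', $tmp or die "open $tmp: $!";
while (defined(my $key = each %hash)) {
    print {$out} "$key\n";
}
close $out or die "close $tmp: $!";

# Let the external sort do the heavy lifting; list-form piped open
# avoids the shell. Drop -n if the keys should sort lexically.
open my $fh, '-|', 'sort', '-n', $tmp or die "sort: $!";
my $order = 0;
while (<$fh>) {
    chomp;
    $hash{$_} = ++$order;    # assign the rank in place
}
close $fh or die "external sort failed: $!";
unlink $tmp;
```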
Which approach is faster really depends: with the first you're effectively scanning and partially sorting your data several times, while with the second the work is pushed onto the external sort program, which needs its own strategy for sorting data of that size.