in reply to Slow at sorting?

My test logfile was 100MB. My first problem was memory: loading the whole log into memory eventually consumed all my resources (382 MB of RAM).

Whoa! Even after you've made one pass to build your sort key, chances are good that you're falling prey to virtual-memory thrashing. If that's the case, the C++ code may well run faster on the 100MB sample (depending on how you code it, C++ will have a lower memory footprint), but at some point you'll throw more data at the C++ code, and it will fall over, too. Do you have any visibility into virtual-memory behavior (e.g., can you get a running page-fault count)?
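If you're on Linux, one cheap way to check is to read a process's own fault counters out of /proc; `/usr/bin/time -v your-command` and `vmstat 1` report similar numbers. A sketch, assuming the Linux /proc layout (the `awk` here reads its own stat file purely as a demo; substitute the PID of your Perl job):

```shell
# Minor faults (field 10 of /proc/PID/stat) are cheap; major faults
# (field 12) mean the kernel had to hit disk. A steadily climbing
# major-fault count while your sort runs is the signature of thrashing.
awk '{print "minor faults:", $10, " major faults:", $12}' /proc/self/stat
```

Wrap that in `watch` (or a loop) against `/proc/<pid>/stat` and you have your running page-fault count.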

As I see it, you have a few choices if you want to stay with Perl:

  1. Add more physical memory (which these days is an inexpensive proposition), or
  2. Break the file into smaller pieces, sort the pieces, then merge them later, or
  3. Do both.
Depending on the size of your production dataset, you may face the same choice with C++.
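Option 2 is a classic external merge sort, and on Unix the standard tools will do it before you write a line of Perl. A sketch with made-up filenames (`big.log` stands in for your 100MB file; here it's synthesized so the example is self-contained):

```shell
# Demo input: a shuffled numeric "log" standing in for the real file
seq 1000 | shuf > big.log

# 1. Split into pieces small enough to sort comfortably in RAM
split -l 200 big.log chunk.

# 2. Sort each piece on its own -- peak memory is one piece, not the whole log
for f in chunk.??; do sort -n "$f" -o "$f.sorted"; done

# 3. Merge the already-sorted pieces in a single streaming pass
sort -n -m chunk.*.sorted -o big.sorted
```

Worth knowing: GNU sort already does exactly this internally, spilling to temp files when its buffer fills, so `sort -S 50M big.log` may be all you need. The split/sort/merge dance is the same trick you'd reproduce in Perl: write each chunk's sorted lines to a temp file, then merge the files line by line.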