sort needs both time and space to do its work, no matter how cleverly it is implemented. I find it hard to imagine a system so poorly configured that it can't handle sorting a paltry 500kB file. But I don't think that really matters in this particular case.
There is a reason that "sort -u" came to be. It is much slower to sort all 57000 instances of several IPs and then throw all but one of each away. So I think "sort | uniq -c" would be much slower than using Perl, where a hash can count each IP in a single pass and only the distinct IPs ever need sorting.
Unfortunately, it doesn't appear that even GNU sort has bothered to implement a -u option that counts the duplicates.
- tye