I hope you don't mind deviating from the "one-liner" requirement. You mentioned you're doing this to learn something along the way, so I wanted to make another suggestion, to that end.
Your current implementation must sort the entire city list for each country just to retrieve the top four items. When you need the top n of anything in sorted order, it's rather unfortunate that the simplest approach is usually to sort the entire list. What you could get away with instead is a "partial sort": one that partitions the input into two parts, a part you want and a part you don't want, and then sorts and returns just the part you want.
It turns out there's a module on CPAN that does this. It's called Sort::Key::Top. Its interface is a little complicated to learn at first, but once you do, it works fairly well. Here is an example:
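To make the idea concrete, here is a minimal core-Perl sketch of a top-n selection that never sorts the whole list. The helper name `top_n` is hypothetical, and this is only one simple way to get the effect (a small buffer kept in descending order); it is not meant to describe how any particular module implements it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the $n largest numbers from the input, in
# descending order, without sorting the whole input. A buffer of at most
# $n values is kept sorted as the input streams past.
sub top_n {
    my ($n, @values) = @_;
    return if $n < 1;

    my @top;    # sorted descending, never longer than $n
    for my $v (@values) {
        # A value no larger than the smallest kept one cannot displace
        # anything once the buffer is full.
        next if @top == $n && $v <= $top[-1];

        # Find the insertion point that keeps @top in descending order.
        my $i = 0;
        $i++ while $i < @top && $top[$i] >= $v;
        splice @top, $i, 0, $v;

        # Discard the part we don't want.
        pop @top if @top > $n;
    }
    return @top;
}

my @populations = (26459, 37423, 699385, 47294, 61739, 18860, 28205);
print join(', ', top_n(4, @populations)), "\n";
# prints: 699385, 61739, 47294, 37423
```

Most values are rejected with a single comparison against the smallest kept item, so the work stays close to one pass over the input when n is small.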
use Sort::Key::Top 'rnkeytopsort';

my %countries;
while( <DATA> ) {
    my( $country ) = m/:([^:]{2}):/;
    push @{$countries{$country}}, $_;
}

print map {
    rnkeytopsort { /^(\d+):/; $1; } 4 => @{$countries{$_}}
} keys %countries;

__DATA__
20470:ZM:Samfya:Africa
20149:ZM:Sesheke:Africa
18638:ZM:Siavonga:Africa
26459:ZW:Beitbridge:Africa
37423:ZW:Bindura:Africa
699385:ZW:Bulawayo:Africa
47294:ZW:Chegutu:Africa
61739:ZW:Chinhoyi:Africa
18860:ZW:Chipinge:Africa
28205:ZW:Chiredzi:Africa
The way this works is it takes your original data set, and divides it into smaller sets, each set representing a country. Then it does a "top-n" partial sort within each country, and prints out the result.
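The grouping step can be looked at on its own. The following is a small sketch, assuming the same colon-separated record layout as the sample data; the `@records` array stands in for reading from a file handle. It also visits the countries via `sort keys`, since the order returned by plain `keys` is not something to rely on.

```perl
use strict;
use warnings;

# Sample records in the same population:country:city:continent layout.
my @records = (
    "20470:ZM:Samfya:Africa",
    "20149:ZM:Sesheke:Africa",
    "18638:ZM:Siavonga:Africa",
    "26459:ZW:Beitbridge:Africa",
    "37423:ZW:Bindura:Africa",
    "699385:ZW:Bulawayo:Africa",
    "47294:ZW:Chegutu:Africa",
    "61739:ZW:Chinhoyi:Africa",
    "18860:ZW:Chipinge:Africa",
    "28205:ZW:Chiredzi:Africa",
);

# Divide the data set into one list per two-letter country code.
my %countries;
for my $line (@records) {
    my ($country) = $line =~ m/:([^:]{2}):/ or next;    # skip malformed lines
    push @{ $countries{$country} }, $line;
}

# Each hash value is now an array reference holding that country's lines.
for my $country (sort keys %countries) {
    printf "%s: %d cities\n", $country, scalar @{ $countries{$country} };
}
# prints:
# ZM: 3 cities
# ZW: 7 cities
```

Each of those per-country lists is what then gets handed to the top-n step.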
I first went looking for a module like this one a while ago, after using C++'s std::partition and std::partial_sort algorithms in a C++ project I was working on at the time. The concepts are pretty simple, but sometimes it takes seeing them in use somewhere else (in this case in a different language) to "discover" their usefulness.
Update:
After preaching about the wasted cycles caused by sorting the entire list of cities just to pick the top four, I went ahead and implemented a version that does just that. Why? It was one of those times where after walking away from the keyboard an idea came along that seemed like it would be fun to explore. Here it is:
print do {
    my($c,$n) = ('',0);
    map  { $_->[0] }
    grep { ($c,$n) = ($_->[2],0) if $_->[2] ne $c; $n++ < 4 }
    sort { $a->[2] cmp $b->[2] || $b->[1] <=> $a->[1] }
    map  { [ $_, /^(\d+):([^:]{2}):/ ] }
    <DATA>;
};

__DATA__
20470:ZM:Samfya:Africa
20149:ZM:Sesheke:Africa
18638:ZM:Siavonga:Africa
26459:ZW:Beitbridge:Africa
37423:ZW:Bindura:Africa
699385:ZW:Bulawayo:Africa
47294:ZW:Chegutu:Africa
61739:ZW:Chinhoyi:Africa
18860:ZW:Chipinge:Africa
28205:ZW:Chiredzi:Africa
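Read this one from the bottom up: the lowest map reads each line from <DATA> and pairs it with its population and country code; sort orders those records by country code, then by population with the largest first; grep resets a counter whenever the country changes and lets only the first four records of each country through; and the top map pulls the original line back out of each record for printing.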
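Unrolled into separate statements, the same pipeline might look like the sketch below. The intermediate variable names and the in-memory `@lines` array (standing in for `<DATA>`) are additions for illustration only; the transformations themselves are the ones from the chained version.

```perl
use strict;
use warnings;

my @lines = (
    "20470:ZM:Samfya:Africa",
    "20149:ZM:Sesheke:Africa",
    "18638:ZM:Siavonga:Africa",
    "26459:ZW:Beitbridge:Africa",
    "37423:ZW:Bindura:Africa",
    "699385:ZW:Bulawayo:Africa",
    "47294:ZW:Chegutu:Africa",
    "61739:ZW:Chinhoyi:Africa",
    "18860:ZW:Chipinge:Africa",
    "28205:ZW:Chiredzi:Africa",
);

# 1. Bottom map: pair each line with its population and country code,
#    giving records of the form [ line, population, country ].
my @records = map { [ $_, /^(\d+):([^:]{2}):/ ] } @lines;

# 2. sort: by country code, then by population, largest first.
my @sorted = sort { $a->[2] cmp $b->[2] || $b->[1] <=> $a->[1] } @records;

# 3. grep: restart the counter when the country changes, and keep only
#    the first four records seen for each country.
my ($current, $seen) = ('', 0);
my @kept = grep {
    ($current, $seen) = ($_->[2], 0) if $_->[2] ne $current;
    $seen++ < 4;
} @sorted;

# 4. Top map: recover the original line from each record.
my @top = map { $_->[0] } @kept;

print "$_\n" for @top;
```

With the sample data this keeps all three ZM cities and the four largest ZW cities (Bulawayo, Chinhoyi, Chegutu, Bindura).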
It seemed like a cool approach to me, even if it gives back a little efficiency by sorting the entire list. I would probably favor the partition/partial sort strategy posted at the top of my answer though; it's fairly clear what it does, and should be efficient.
Dave
In reply to Re: Using map function to print few elements of list returned by sort function
by davido
in thread Using map function to print few elements of list returned by sort function
by jaypal