My approach is to filter records while reading data.

    use warnings;
    use strict;

    my $format = "%-11s %-8s\n";

    open my $data, '<', 'data.txt'
        or die "Could not open 'data.txt': $!\n";

    my %dataset;

    RECORD:
    while ( my $line = <$data> ) {
        chomp $line;
        my ( $coord, $dist ) = split ',', $line;
        next RECORD
            if exists $dataset{$coord} and $dataset{$coord} < $dist;
        $dataset{$coord} = $dist;
    }

    printf $format, 'coordinate', 'distance';
    printf $format, $_, $dataset{$_}
        for sort { $a <=> $b } keys %dataset;

- I turn on warnings and strict as soon as my program grows beyond a few lines, because I don't want to spend 2 hours investigating something strange which turns out to be a typo.
- I'm not sure how specific your format needs to be, so I assumed fixed column widths and used printf with a format string, rather than print or say.
- Read in lines from a comma-separated file, discard the newline and split on comma into temporary variables.
- If the distance is not the shortest on record for this coordinate, jump to the next input record; otherwise, save it. Although you can use next or last or redo without a label, I like to always specify what I'm next-ing or last-ing or redo-ing, especially if the loop is long or if it's a nested loop.
- Extract the keys from the hash, sort them numerically, and print out the keys and values.
- I never actually close the input file: this script is too short and fast to care, and in more complicated instances I would have a readfile() routine, whose lexical filehandle would be closed automatically when it went out of scope.
- You can get more information on next, last, sort and other built-in functions by typing perldoc -f sort (or whatever the function name may be) into an xterm window, or you can use the perldoc web site.
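
For what it's worth, here is a rough sketch of what the readfile() routine mentioned above might look like. Only the name comes from the note; the argument, the hash-reference return value and the variable names are my own assumptions. It relies on the fact that a lexical filehandle is closed when the last reference to it goes away, which here is when the sub returns.

```perl
use strict;
use warnings;

# Hypothetical helper: read "coord,dist" lines from the file at $path and
# return a hash reference holding the shortest distance per coordinate.
sub readfile {
    my ($path) = @_;

    open my $fh, '<', $path
        or die "Could not open '$path': $!\n";

    my %shortest;

    RECORD:
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $coord, $dist ) = split ',', $line;

        # Skip this record if a shorter distance is already stored.
        next RECORD
            if exists $shortest{$coord} and $shortest{$coord} < $dist;

        $shortest{$coord} = $dist;
    }

    # No explicit close: $fh is lexical, so the file is closed
    # when $fh goes out of scope at the end of this sub.
    return \%shortest;
}
```

With that in place, the main program would shrink to something like my $dataset = readfile('data.txt'); followed by the same two printf statements, looping over keys %$dataset instead of keys %dataset.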
As Occam said: Entia non sunt multiplicanda praeter necessitatem.