in reply to Need help figuring out how to order/eval the numbers
Hello perlynewby,
Since the input data will “later be scrambled,” there doesn’t seem to be much point in pursuing the strategy of Experiment 1. I have therefore looked only at Experiment 2, using a hash for lookup. The resulting code is far from perfect, but it produces the desired output:
#! perl

####################################################################
# Key        0        1       2       3               4
# Italian => Spanish, French, German, English|German, English|German
####################################################################

use strict;
use warnings;
use autodie;
use constant NAMES => qw(uno due tre quattro cinque sei sette otto nove
                         dieci undici dodici tredici);

my %hash;

open my $in, '<', './Test_Data_RandNumbers.txt';

while (<$in>)
{
    my ($ita, $spa, $ger) = split /[=\s,]+/;

    if ($ita)
    {
        $hash{$ita}[0] = $spa;
        $hash{$ita}[2] = $ger if $ger;
    }
}

close $in;

open my $in1, '<', './Test_Data_More_RandNumbers.txt';

while (<$in1>)
{
    # $num1 & $num2 may each be either English or German
    my ($ita, $fren, $num1, $num2) = split /[=\s,]+/;

    $hash{$ita}[1] = $fren;

    if ($num1)
    {
        $hash{$ita}[3] = $num1;
        $hash{$ita}[4] = $num2 if $num2;
    }
}

close $in1;

open my $out , '>', './OUT_Test_Data_Ita_SpanFren_rest.txt';
open my $out1, '>', './OUT_Test_data_NO_match_SpanFren.txt';

for my $ita (sort { sort_italian() } keys %hash)
{
    if ($ita)
    {
        my $fh = defined $hash{$ita}[0] && defined $hash{$ita}[1]
               ? $out : $out1;

        print $fh "$ita => ",
                  join(',', map { $_ // () } @{ $hash{$ita} }), "\n";
    }
}

close $out;
close $out1;

{
    my %numbers;

    BEGIN
    {
        my $i    = 1;
        %numbers = map { $_ => $i++ } NAMES;
    }

    sub sort_italian
    {
        # Add error checking here!
        return $numbers{$a} <=> $numbers{$b};
    }
}
Some notes:
I have moved the file open and close statements to reflect the actual usage of the filehandles. It is generally a good idea to minimise a variable’s effective scope.
I have added tests such as:
if ($ita)
to exclude undefs and null strings from the hash. These tests should really be more explicit:
if (defined $ita && $ita ne '')
but the shortcut is OK since neither 0 nor '0' is going to occur in the data as a valid value.
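To make the difference between the two tests concrete, here is a minimal sketch. The helper names shortcut_test and explicit_test are invented for this illustration and do not appear in the script above; the only case where the two disagree is a value of 0 or '0'.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
sub shortcut_test
{
    my ($v) = @_;
    my $ok  = $v ? 1 : 0;                          # plain truth test
    return $ok;
}

sub explicit_test
{
    my ($v) = @_;
    my $ok  = (defined $v && $v ne '') ? 1 : 0;    # defined and non-empty
    return $ok;
}

for my $v (undef, '', '0', 'uno')
{
    my $shown = defined $v ? "'$v'" : 'undef';
    printf "%-6s shortcut=%d explicit=%d\n",
           $shown, shortcut_test($v), explicit_test($v);
}
```

Only the '0' row differs: the shortcut rejects it, the explicit test accepts it.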
In the line
join(',', map { $_ // () } @{ $hash{$ita} }),
the map removes undef fields which otherwise produce warnings such as this:
Use of uninitialized value in join or string at ... line ...
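As a small self-contained illustration of that idiom (the sample row below is made up, standing in for an entry of %hash with gaps), the same filtering can also be written with grep, which some may find easier to read:

```perl
use strict;
use warnings;

# Made-up row with gaps, like @{ $hash{$ita} } when one file lacks an entry.
my @fields = ('uno', undef, 'eins', undef, 'one');

# The idiom from the script: each undef element becomes the empty list.
my $via_map  = join ',', map  { $_ // () }   @fields;

# An equivalent filter that keeps only the defined elements.
my $via_grep = join ',', grep { defined $_ } @fields;

print "$via_map\n";     # uno,eins,one
print "$via_grep\n";    # uno,eins,one
```

Both forms need Perl 5.10 or later only where // is used; the grep version works on older perls too.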
The hash keys are sorted using a special-purpose sub, which in turn uses a hash that maps each Italian number name to its corresponding numerical value. See the documentation for sort (perldoc -f sort) for other syntax options.
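Two of those other syntax options, sketched on a shortened word list. The names %rank, $last and by_italian are invented for this example, and sorting unknown words to the end is only one possible policy for the error checking that the script leaves open:

```perl
use strict;
use warnings;

my @names = qw(uno due tre quattro cinque);
my $i     = 1;
my %rank  = map { $_ => $i++ } @names;
my $last  = @names + 1;          # rank given to any unknown word (assumed policy)

# Named comparison sub: sort accepts the sub's name in place of a block.
sub by_italian
{
    return ($rank{$a} // $last) <=> ($rank{$b} // $last);
}

my @scrambled = qw(tre uno cinque due quattro);

my @named  = sort by_italian @scrambled;
my @inline = sort { $rank{$a} <=> $rank{$b} } @scrambled;   # inline block

print "@named\n";     # uno due tre quattro cinque
print "@inline\n";    # uno due tre quattro cinque
```

The inline block is the shortest form; the named sub is handy when the comparison is reused or needs extra checks.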
The two chomps are not needed as each is effectively performed by the immediately-following split on whitespace (\s).
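A quick check of that claim, using a made-up input line (the real data files may be laid out differently). The trailing newline is consumed as a separator, and split discards trailing empty fields by default, so nothing is left over:

```perl
use strict;
use warnings;

my $line = "uno = uno, eins\n";    # made-up sample, newline still attached

my @fields = split /[=\s,]+/, $line;

print scalar(@fields), " fields: ", join('|', @fields), "\n";
# 3 fields: uno|uno|eins
```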
Hope that helps,
Updates: Made minor improvements to code and text; added final note.
Athanasius <°(((>< contra mundum    Iustus alius egestas vitae, eros Piratica,