perlynewby has asked for the wisdom of the Perl Monks concerning the following question:

Need help figuring this out: I need to concat 2 numbers, but I think my method is not correct ;-). Shocking, I know!

#define if there (is) are numbers in position 3 and 4 concat to position 3
if (defined ($num1) and defined ($num2)){
    $hash{$ita}[3]=$num1$num2; # concat if numbers defined in poition 1 and 2 and store in 3

Also, please note that some Italian numbers are not found in both files. Their output will show something like this:

uno => uno, ,eins

where French is empty. How can I handle this better?

Will a hash of hashes fix the output? Is there a better method? Please advise, as this is my 2nd week of Perl coding. 2nd week of coding anything, period. Still learning a lot from your advice.

DATA: numbers in Spanish and maybe German

uno = uno,eins
due = dos,zwei
tre = tres,drei
quattro = quatro
cinque = cinco,funf
sei = seis,
sette = siete,sechs
otto = ocho
nouve = nueve, neun
dieci = diez, zehn
undici = once, elf
dodici = doce
tredici = trece, dreizehn

DATA: numbers in french and maybe German or English or both

due =deux, two,
tre = trois,drei, three
quattro = quatre,four
cinque = cinq,funf, five
sei = six , six
sette = sept , seven ,sechs
dieci =dix
undici = onze,eleven
tredici = treize,thirteen, dreizehn

use strict;
use diagnostics;
use warnings;
use autodie qw(open close);
use Data::Dump qw(dump);

#declare variables
my %hash;
my $data;

#opening Files using autodie to cut typing...
open my $in, '<',"./Test_Data_RandNumbers.txt";
open my $in1,'<',"./Test_Data_More_RandNumbers.txt";
open my $out ,'>' ,"./OUT_Test_Data_Ita_SpanFren_rest.txt";
open my $out1,'>',"./OUT_Test_data_NO_match_SpanFren.txt";
#open my $out2,'>' , "./Test_Data_Out_None_Match.txt";

while (<$in>){
    #data manipulation to clean up ='s and ,'s
    #dieci = diez, zehn -->worse case, remove spaces and = and comma;
    #quattro = quatro -->only one number with spaces or not in from of =...
    chomp;
    my ($ita,$spa,$num)= split(/[=\s,]+/); # removes '=' or 's' or ',' & '+' to match 1 or more these characters
    $hash{$ita}[0]=$spa;
    #what about if there is no $num at position 1 in 1st file?
    if (defined $num){
        $hash{$ita}[2]=$num; # if defined then keep it for check later in position 2 in array
    }
}
close $in;

while (<$in1>){
    chomp;
    my ($ita,$fren,$num1,$num2)= split(/[=\s,]+/); #creates col of numbers
    $hash{$ita}[1]=$fren; #now hashs format will look like this: ita=> spa fren
    #define if there is are numbers in position 3 and 4 concat to position 3
    if (defined ($num1) and defined ($num2)){
        $hash{$ita}[3]=$num1$num2; # concat if numbers defined in poition 1 and 2 and store in 3
    }
    elsif(defined $num1) {
        $hash{$ita}[3]=$num1; #if array has num in pos 2 then save number in position 3(dieci)
    }
}
close $in1;

foreach my $ita (keys %hash){
    if($hash{$ita}[0] and $hash{$ita}[1]){
        print $out "$ita =>", join(',',@{$hash{$ita}}),"\n";
    }else {
        print $out1 "$ita =>",join(',',@{$hash{$ita}}),"\n";
    }
}
#close $out;
#close $out1;

Replies are listed 'Best First'.
Re: concatenation 2 values then pushing to a hash_ref
by Aldebaran (Curate) on Jun 12, 2015 at 01:27 UTC

    Hello perlynewby, and welcome to the monastery.

    I made some guesses at what I think you're intending to do and am by no means the most-sophisticated coder around here, but I think I'll be able to help you along a bit, as I do have some output that you can look at and compare to your needs. I had to change a few things to get it to run at all, so I'll list the new script:

    use strict;
    use diagnostics;
    use warnings;
    use autodie qw(open close);
    use Data::Dump qw(dump);

    #declare variables
    my %hash;
    my $data;

    #opening Files using autodie to cut typing...
    open my $in, '<', "Test_Data_RandNumbers.txt";
    open my $in1, '<', "Test_Data_More_RandNumbers.txt";
    open my $out, '>', "OUT_Test_Data_Ita_SpanFren_rest.txt";
    open my $out1, '>', "OUT_Test_data_NO_match_SpanFren.txt";

    #open my $out2,'>' , "./Test_Data_Out_None_Match.txt";
    while (<$in>) {

        #data manipulation to clean up ='s and ,'s
        #dieci = diez, zehn -->worse case, remove spaces and = and comma;
        #quattro = quatro -->only one number with spaces or not in from of =...
        chomp;
        my ( $ita, $spa, $num ) = split(/[=\s,]+/); # removes '=' or 's' or ',' & '+' to match 1 or more these characters
        $hash{$ita}[0] = $spa;

        #what about if there is no $num at position 1 in 1st file?
        if ( !defined $num ) {
            $hash{$ita}[1] = $num;

            # if defined then keep it for check later in position 2 in array
        }
    }
    close $in;
    while (<$in1>) {
        chomp;
        my ( $ita, $fren, $num1, $num2 ) = split(/[=\s,]+/); #creates col of numbers
        $hash{$ita}[1] = $fren; #now hashs format will look like this: ita=> spa fren

        #define if there is are numbers in position 3 and 4 concat to position 3
        if ( defined($num1) and defined($num2) ) {
            $hash{$ita}[3] = $num1 . $num2;

            # concat if numbers defined in poition 1 and 2 and store in 3
        }
        elsif ( defined $num1 ) {
            $hash{$ita}[3] = $num1;

            #if array has num in pos 2 then save number in position 3(dieci)
        }
    }
    close $in1;
    foreach my $ita ( keys %hash ) {
        if ( $hash{$ita}[0] and $hash{$ita}[1] ) {
            print $out "$ita =>", join( ',', @{ $hash{$ita} } ), "\n"; #line 54
        }
        else {
            print $out1 "$ita =>", join( ',', @{ $hash{$ita} } ), "\n";
        }
    }
    close $out;
    close $out1;

    New data sets of same cardinality:

    Test_Data_More_RandNumbers.txt:

    due =deux, two,
    tre = trois,drei, three
    quattro = quatre,four
    cinque = cinq,funf, five
    sei = six , six
    sette = sept , seven ,sechs
    dieci =dix
    undici = onze,eleven
    tredici = treize,thirteen, dreizehn

    Test_Data_RandNumbers.txt:

    uno = uno,eins
    due = dos,zwei
    tre = tres,drei
    quattro = quatro
    cinque = cinco,funf
    sei = seis,
    sette = siete,sechs
    otto = ocho
    nouve = nueve, neun

    I changed the paths so that the files are opened in the same directory as the script. "../" opens the parent, but I don't know what "./" was to open. I added a period where you indicated there was to be concatenation, which seems apropos. I changed the subscript to 1 to correspond to the second position, as it usually does. Since I was drawing a warning about an uninitialized variable on line 54, I changed the data sets to be equally long. I marked line 54 for those who might diagnose it better. I uncommented the close statements. Finally, I ran it through perltidy, which I heartily recommend. Here is the output:

    C:\Users\Fred\Desktop\pm>type OUT*

    OUT_Test_Data_Ita_SpanFren_rest.txt

    sei =>seis,six,,six
    quattro =>quatro,quatre,,four
    due =>dos,deux,,two
    sette =>siete,sept,,sevensechs
    cinque =>cinco,cinq,,funffive
    tre =>tres,trois,,dreithree

    OUT_Test_data_NO_match_SpanFren.txt

    dieci =>,dix
    uno =>uno
    nouve =>nueve
    otto =>ocho,
    undici =>,onze,,eleven
    tredici =>,treize,,thirteendreizehn

    Clearly there are some comma issues at a minimum, but there is output, and just having that can help you be specific about what you need to modify. Many happy adventures with Perl.

      sorry, I didn't mean to make you guess.

      GOAL: find the common Italian numbers in both files that have both Spanish and French translations (did that). Then I added complexity to my learning by adding other number translations (German, English) and building the match file to contain these.

      lastly, to dump the numbers that don't have a matching Italian number in any.

      things I'd hope to learn.

      1) HASH_REF
      2) Print HASH_REF
      3) conditionals (if,else,define) with hash_ref
      4) Concat (now, I know needs a (.) THANKS TO YOU! although I saw an example where it didn't and that's what I followed)
      5) play with SPLIT to remove =,spaces:/ and stuff.

      Questions for clarity:

      EXAMPLE snippet 1, on which I modeled my concat syntax. Why the difference? Why didn't mine work?

      chomp;
      my($col1,$col2,$rest)=split(/\t/);
      my $ckey="$col1$col2" #<= NO PERIOD!
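
The difference asked about here comes down to quoting: inside a double-quoted string Perl interpolates variables, so `"$col1$col2"` needs no operator, whereas two bare scalars next to each other (`$num1$num2`) are a syntax error and need the `.` operator. A minimal sketch, using made-up values for `$num1` and `$num2` rather than anything read from the data files:

```perl
use strict;
use warnings;

# Hypothetical values standing in for two fields from one input line.
my ( $num1, $num2 ) = ( 'funf', 'five' );

# Inside double quotes, variables are interpolated; no operator is needed.
my $interpolated = "$num1$num2";

# Outside quotes, two scalars must be joined with the . operator.
my $concatenated = $num1 . $num2;

# join() is convenient when a separator is wanted between the pieces.
my $joined = join ' ', $num1, $num2;

print "$interpolated\n";    # funffive
print "$concatenated\n";    # funffive
print "$joined\n";          # funf five
```

All three forms build a new string; which one reads best is largely a matter of taste.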

      Question 2: what does this piece do? Can you explain your addition of "!"? I'm thinking that if $num is UNDEF, then go into the loop... is this thinking correct?

      if (!defined $num){
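
For reference, `!` is logical negation, so `if ( !defined $num )` enters its block only when `$num` is undef. Note that `if` is a conditional rather than a loop: the block runs at most once per test. The sketch below uses a hypothetical helper, `branch_for`, purely to show which branch is taken for a few inputs:

```perl
use strict;
use warnings;

# Hypothetical helper that reports which branch an "if" would take.
sub branch_for {
    my ($num) = @_;
    if ( !defined $num ) {
        return 'missing';    # entered only when $num is undef
    }
    return 'present';        # $num holds a value, even '' or 0
}

print branch_for(undef),  "\n";    # missing
print branch_for('eins'), "\n";    # present
print branch_for(''),     "\n";    # present
```

An empty string counts as defined, which matters when a data line ends in a trailing comma.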

      The no-match output (Italian numbers not found in both files) has a printing bug I would like to remove. Please note that there is no French number in slot 1, so it prints ",," instead. How do I remove these?

      nouve =>nueve,,neun
      uno =>uno,,eins

      I build my hash as follows:

      Italian => Spanish, French, whatever

      so in this case, one number is not found in the other and so the spot is just filled with ",,"
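
One way to avoid the ",," is to filter out the empty slots before joining. This is only a sketch: `format_row` is a hypothetical helper, and it assumes a row shaped like the arrays stored in `%hash`, with undef (or an empty string) wherever a translation is missing.

```perl
use strict;
use warnings;

# Hypothetical formatter for one entry of %hash.
sub format_row {
    my ( $ita, $row ) = @_;

    # Keep only slots that are defined and non-empty, so gaps leave no ",,".
    my @present = grep { defined($_) && length($_) } @$row;
    return "$ita =>" . join( ',', @present );
}

print format_row( 'uno', [ 'uno', undef, 'eins' ] ), "\n";    # uno =>uno,eins
```

The trade-off is that column positions no longer carry meaning: after filtering, nothing says whether "eins" was the French or the German slot. If position matters, substituting a placeholder such as "-" for missing values keeps the columns aligned.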

      NEED HELP: how to order the numbers on output? Any ideas?
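
Hash keys come back in no particular order, so ordered output needs an explicit sort. Since the keys are Italian words rather than digits, one option is a lookup table giving each word its rank. The sketch below assumes such a table (`@order`, spelled the way the data files spell the words) and a hypothetical helper, `in_counting_order`; neither appears in the original script.

```perl
use strict;
use warnings;

# Assumed ordering table, using the spellings found in the data files.
my @order = qw(uno due tre quattro cinque sei sette otto nouve
               dieci undici dodici tredici);
my %rank;
@rank{@order} = ( 0 .. $#order );
my $last = scalar @order;    # rank given to any word not in the table

# Hypothetical helper: keys in counting order, unknown words last (alphabetical).
sub in_counting_order {
    my @keys = @_;
    return sort {
        ( $rank{$a} // $last ) <=> ( $rank{$b} // $last )
            or $a cmp $b
    } @keys;
}

print join( ' ', in_counting_order(qw(tre uno due)) ), "\n";    # uno due tre
```

In the output loop this would replace `keys %hash` with `in_counting_order( keys %hash )`. The `//` operator needs Perl 5.10 or later.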

      CORRECTED my hash structure like this since concat didn't print like I wanted

      my ($ita,$fren,$num1,$num2)= split(/[=\s,]+/); #creates col of numbers
      $hash{$ita}[1]=$fren; #now hash_ref format will look like this: ita=> spa , fren
      #define if there is are numbers in position 3 and 4 concat to position 3
      if (defined ($num1)){
          $hash{$ita}[3]=$num1; # % looks like this: Ita=> Spa, Fren, RandNUM[3]
      }
      if (defined $num2){
          $hash{$ita}[4]=$num2; # % looks like this Ita=> Span,Fren, Randnum[3]?(ifdefined), Randnum[4]
      }
      }

      somehow I feel there is a better way to do this...can hash of hashes be better? I don't know yet, I guess I need to learn that next...
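
For comparison, here is a rough sketch of what a hash of hashes could look like, keyed by language name instead of array position. It assumes the language of each word is known when it is stored, which the data files do not state for the extra German and English columns; `add_translation` and the language labels are made up for illustration.

```perl
use strict;
use warnings;

my %hash;

# Hypothetical helper: record one translation under a named language.
sub add_translation {
    my ( $ita, $lang, $word ) = @_;
    return unless defined($word) && length($word);    # skip missing fields
    $hash{$ita}{$lang} = $word;
    return;
}

add_translation( 'uno', 'spanish', 'uno' );
add_translation( 'uno', 'german',  'eins' );
add_translation( 'due', 'spanish', 'dos' );
add_translation( 'due', 'french',  'deux' );

for my $ita ( sort keys %hash ) {
    my $t      = $hash{$ita};
    my $line   = join ', ', map { "$_=$t->{$_}" } sort keys %$t;
    my $status = ( exists $t->{spanish} && exists $t->{french} )
        ? 'match'
        : 'no match';
    print "$ita => $line ($status)\n";
}
```

With named keys there are no empty slots to print, and the match test reads as "has both spanish and french" rather than relying on positions 0 and 1.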

      GRAZIE MILLE

        As a humble scribe, I welcome the opportunity to assist you in this matter. It's a challenge for me to understand exactly what you intend, and it is for you as well. The exercise of writing and debugging perl helps us to specify our intent, in particular, when informed by the script, input, and output. Having read your response, I've changed all three. I changed the input back to what you posted in the original post. I made several changes to the script; it's best just to list it before commenting too much:

        use strict;
        use diagnostics;
        use warnings;
        use autodie qw(open close);
        use Data::Dumper;
        use 5.010;

        #declare variables
        my %hash;
        my $data;

        #opening Files using autodie to cut typing...
        open my $in, '<', "Test_Data_RandNumbers.txt";
        open my $in1, '<', "Test_Data_More_RandNumbers.txt";
        open my $out, '>', "OUT_Test_Data_Ita_SpanFren_rest.txt";
        open my $out1, '>', "OUT_Test_data_NO_match_SpanFren.txt";

        #open my $out2,'>' , "./Test_Data_Out_None_Match.txt";
        while (<$in>) {

            #data manipulation to clean up ='s and ,'s
            #dieci = diez, zehn -->worse case, remove spaces and = and comma;
            #quattro = quatro -->only one number with spaces or not in from of =...
            chomp;
            my ( $ita, $spa, $num ) = split(/[=\s,]+/);
            say "values are $ita $spa $num";
            $hash{$ita}[0] = $spa;

            #what about if there is no $num at position 1 in 1st file?
            if ( defined $num ) {
                $hash{$ita}[2] = $num;
            }
        }
        close $in;
        while (<$in1>) {
            chomp;
            my ( $ita, $fren, $num1, $num2 ) = split(/[=\s,]+/);
            say "values are $ita $fren $num1 $num2";
            $hash{$ita}[1] = $fren; #now hash's format will look like this: ita=> spa fren

            #define if there are numbers in position 3 and 4 concat to position 3
            if ( defined($num1) and defined($num2) ) {
                $hash{$ita}[3] = "$num1 $num2";
            }
            elsif ( defined $num1 ) {
                $hash{$ita}[3] = $num1;
            }
        }
        close $in1;
        foreach my $ita ( keys %hash ) {
            if ( $hash{$ita}[0] and $hash{$ita}[1] ) {
                print $out "$ita =>", join( ',', @{ $hash{$ita} } ), "\n";
            }
            else {
                print $out1 "$ita =>", join( ',', @{ $hash{$ita} } ), "\n";
            }
        }
        print Dumper(\%hash);
        close $out;
        close $out1;

        I use the feature "say" to understand what values are going into the script. I get a lot of undefined values, but they don't "sink the ship," as it were. I re-wrote my concatenation syntax to mimic yours--I added a space in the middle of a quoted string--yet I wonder if this is actually what you want as far as the logic is concerned. Let's take a look at the output:

        C:\Users\Fred\Desktop\pm>type OUT*

        OUT_Test_Data_Ita_SpanFren_rest.txt

        sei =>seis,six,,six
        quattro =>quatro,quatre,,four
        due =>dos,deux,zwei,two
        dieci =>diez,dix,zehn
        sette =>siete,sept,sechs,seven sechs
        cinque =>cinco,cinq,funf,funf five
        undici =>once,onze,elf,eleven
        tre =>tres,trois,drei,drei three
        tredici =>trece,treize,dreizehn,thirteen dreizehn

        OUT_Test_data_NO_match_SpanFren.txt

        dodici =>doce
        uno =>uno,,eins
        nouve =>nueve,,neun
        otto =>ocho

        More informative might be the output from STDOUT:

        $VAR1 = {
            'sei'     => [ 'seis', 'six', '', 'six' ],
            'quattro' => [ 'quatro', 'quatre', undef, 'four' ],
            'due'     => [ 'dos', 'deux', 'zwei', 'two ' ],
            'dieci'   => [ 'diez', 'dix', 'zehn' ],
            'sette'   => [ 'siete', 'sept', 'sechs', 'seven sechs' ],
            'dodici'  => [ 'doce' ],
            'uno'     => [ 'uno', undef, 'eins' ],
            'nouve'   => [ 'nueve', undef, 'neun' ],
            'cinque'  => [ 'cinco', 'cinq', 'funf', 'funf five' ],
            'otto'    => [ 'ocho' ],
            'undici'  => [ 'once', 'onze', 'elf', 'eleven' ],
            'tre'     => [ 'tres', 'trois', 'drei', 'drei three' ],
            'tredici' => [ 'trece', 'treize', 'dreizehn', 'thirteen dreizehn' ]
        };
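
The empty string under 'sei' and the trailing space in 'two ' most likely come from how `split` treats trailing fields. When the result is assigned to a list of N variables and no LIMIT is given, `split` behaves as if LIMIT were N+1, and with a positive limit trailing empty fields are kept. A small sketch using the `due` line from the second data file:

```perl
use strict;
use warnings;

# One line from the second data file, with its trailing comma.
my $line = 'due =deux, two,';

# Four variables on the left give split an implicit LIMIT of 5,
# so the empty field after the final comma is kept.
my ( $ita, $fren, $num1, $num2 ) = split /[=\s,]+/, $line;

# Assigning to an array applies no implicit limit; trailing empties are dropped.
my @fields = split /[=\s,]+/, $line;

print( defined($num2) ? "num2 is '$num2'\n" : "num2 is undef\n" );    # num2 is ''
print scalar(@fields), "\n";                                          # 3
```

So `$num2` is defined but empty for that line, and `"$num1 $num2"` becomes `'two '`. Testing `length` as well as `defined` would filter such fields out.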

        I ask you to inspect the output. Aside from the commas, is this what you want? It seems to me that the commas are a cosmetic issue, while determining the fundamental logic is primary. What's more, I don't think you want concatenation, as it produces duplicate values. But, as a humble scribe, I have no means to scry.
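
If the duplicates are indeed unwanted, one common idiom is to keep the extra words as a list and filter it through a `%seen` hash, rather than concatenating them. A minimal sketch, with `unique_words` as a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first occurrence of each non-empty word,
# preserving the original order.
sub unique_words {
    my %seen;
    return grep { defined($_) && length($_) && !$seen{$_}++ } @_;
}

# 'funf' arrives from both files for cinque; it should appear only once.
my @extras = unique_words( 'funf', 'funf', 'five' );
print join( ',', @extras ), "\n";    # funf,five
```

The same filter also drops the empty fields left behind by trailing commas in the data.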

Re: concatenation 2 values then pushing to a hash_ref
by Anonymous Monk on Jun 12, 2015 at 00:10 UTC