iterating through a hash?

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!
I have two files, in the following formats:

FILE1
peter  1234
nick   1111
john   4567
mike   3333
george 2222
antony 5632
migel  1209

FILE2
john   7559
mike   3333
george 2222
peter  5643
nick   1111
julia  3456

What I want to do is:
Read through FILE1 and store each name and number in a hash, call it %hash1 [I have done this].
For each $key of %hash1, I must check whether it exists in FILE2. If it does, I must then check whether the number in FILE1 [$hash1{$key}] is the same as the number in FILE2 [$hash2{$key}]. If it is the same, I print $key."\t".'OK'."\t".$hash1{$key};
and if the numbers don't match, I print $key."\tWRONG\t".$hash1{$key}.
Also, if a $key of %hash1 does not exist in FILE2, I must still print it.
To be more clear, I must have:

peter  WRONG   1234
nick   OK      1111
john   WRONG   4567
mike   OK      3333
george OK      2222
antony WRONG   5632
migel  WRONG   1209

I have written:

open ONE, $FILE1;
while (<ONE>)
{
    chomp;
    if ($_=~/^(.*)\t(.*)/)
    {
        $hash1{$1} =$2;
    }
}

foreach $key(keys %hash1)
{
    open TWO, $FILE2;
    while (<TWO>)
    {
    chomp;
        if ($_=~/^$key\t(.*)/)
            {
            if ($1 eq $hash1{$key})
                {
                print $key."\t".'OK'."$hash1{$key}"."\n";
                }
            else 
                {                
                print $key."\t".'WRONG'."\t".$hash1{$key}."\n";
                }
            }
            
    }
}

What I cannot do is print all names of FILE1 that ARE NOT in FILE2.
Any hints?

Code tags added by GrandFather


Replies are listed 'Best First'.
Re: iterating through a hash? by shmem (Chancellor) on Sep 17, 2006 at 10:08 UTC
Why do you open file2 for every key in %hash1 and read that file from start to end, only to look at the key on one line? It suffices to open file2 once and read in key/value pairs (as you did with file1); while doing so you can find out whether the current key exists in file1:

open TWO, '<', $FILE2; # better be explicit ;-)
while (<TWO>)
{
    chomp; # correct indent level is here.
    my ($key,$value) = split /\s+/,$_;
    print "$key NOT IN FILE 1\n" unless exists $hash1{$key};
    $hash2{$key} = $value;
}

Now you have %hash1 and %hash2 populated; you can iterate over the key lists and see whether the keys exist in each other, and if so, whether their values are identical:

foreach my $key (keys %hash1)
{
    if (exists($hash2{$key}))
    {
        if ($hash1{$key} == $hash2{$key}) # '==' if numeric, 'eq' if string
        {
            ...
        }
        else
        {
            ...
        }
    }
    else # this is the case you missed - $key not in file2
    {
        ...
    }
}

--shmem
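For illustration, here is one way the two-hash approach described above might be filled in so that it produces the output the question asks for. It is only a sketch: the helper names read_pairs and report_lines are hypothetical, the sample data is held in in-memory strings rather than real files, and it assumes (as the requested output suggests) that a name missing from FILE2 should be reported as WRONG. File order is kept in an array because hash keys come back in no particular order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data copied from the question; in-memory strings stand in for the files.
my $file1 = "peter\t1234\nnick\t1111\njohn\t4567\nmike\t3333\ngeorge\t2222\nantony\t5632\nmigel\t1209\n";
my $file2 = "john\t7559\nmike\t3333\ngeorge\t2222\npeter\t5643\nnick\t1111\njulia\t3456\n";

# read_pairs (hypothetical helper): returns the names in file order
# and a name => number hash, both as references.
sub read_pairs {
    my ($text) = @_;
    my ( @order, %pairs );
    open my $fh, '<', \$text or die "Can't read in-memory data: $!";
    while ( my $line = <$fh> ) {
        my ( $name, $number ) = split ' ', $line;
        next unless defined $number;    # skip blank or malformed lines
        push @order, $name unless exists $pairs{$name};
        $pairs{$name} = $number;
    }
    close $fh;
    return ( \@order, \%pairs );
}

# report_lines (hypothetical helper): one "name<TAB>status<TAB>number" line
# per file1 name; a name absent from file2 counts as WRONG (an assumption).
sub report_lines {
    my ( $order, $h1, $h2 ) = @_;
    my @lines;
    for my $name (@$order) {
        my $status =
            ( exists $h2->{$name} && $h1->{$name} eq $h2->{$name} )
            ? 'OK'
            : 'WRONG';
        push @lines, join( "\t", $name, $status, $h1->{$name} );
    }
    return @lines;
}

my ( $order1, $hash1 ) = read_pairs($file1);
my ( undef,   $hash2 ) = read_pairs($file2);

print "$_\n" for report_lines( $order1, $hash1, $hash2 );
```

With real files, the in-memory open would be replaced by a three-argument open on the file name. The comparison uses eq, so "0123" and "123" count as different; switch to == if the numbers should be compared numerically.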
Re^2: iterating through a hash? by Cristoforo (Curate) on Sep 17, 2006 at 20:22 UTC
The solution could even use just 1 hash if the whitespace separating the name and number is the same in both files (same number of tabs or spaces).

#!/usr/bin/perl
use strict;
use warnings;

open FH2, "o44.txt" or die $!;
my %file2 = map { $_ => 1 } <FH2>;
close FH2 or die $!;

open FH1, "o33.txt" or die $!;
while (<FH1>) {
    s/\s+/$file2{ $_ } ? " OK " : " WRONG "/e;
    print;
}
close FH1 or die $!;
Re^2: iterating through a hash? by graff (Chancellor) on Sep 17, 2006 at 21:28 UTC
Cristoforo is right about needing only one hash to do this, but it doesn't need to depend on having identical whitespace in the two files:

use strict;

my %hash;
open( F, "<", "file1" ) or die "file1: $!";
while (<F>) {
    my ( $name, $val ) = split;
    $hash{$name} = $val;
}
open( F, "<", "file2" ) or die "file2: $!";
while (<F>) {
    my ( $name, $val ) = split;
    if ( exists( $hash{$name} ) ) {
        my $status = ( $hash{$name} eq $val ) ? 'OK' : 'WRONG';
        print "$name\t$status\t$hash{$name}\n";
        delete $hash{$name}; # don't need this anymore
    }
}

# any keys left in hash are cases where a file1 name was not found in file2
# so print those now:

if ( keys %hash ) {
    print "\nNames not found in file2:\n";
    print "$_\t$hash{$_}\n" for ( sort keys %hash );
}

(Updated the code so that the names are always printed with the values from file1, as per the OP spec.) Nothing happens for names in file2 that don't exist in file1, but the OP didn't say that anything needed to be done for those. There was also nothing said about the same name occurring more than once in either file, but maybe the AM won't need to worry about that...
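If repeated names do turn out to matter, a check along the following lines could flag them before the comparison runs. This is a hedged sketch, not part of the reply above: find_duplicates is a hypothetical helper, and the sample data (with "peter" listed twice) is invented for the example.

```perl
use strict;
use warnings;

# Invented sample data: "peter" appears twice.
my $text = "peter\t1234\nnick\t1111\npeter\t9999\n";

# find_duplicates (hypothetical helper): returns the sorted list of names
# that occur on more than one line of the given text.
sub find_duplicates {
    my ($data) = @_;
    my %seen;
    open my $fh, '<', \$data or die "Can't read in-memory data: $!";
    while ( my $line = <$fh> ) {
        my ($name) = split ' ', $line;
        next unless defined $name;    # skip blank lines
        $seen{$name}++;
    }
    close $fh;
    my @dups = sort grep { $seen{$_} > 1 } keys %seen;
    return @dups;
}

my @dups = find_duplicates($text);
print "Repeated names: @dups\n" if @dups;
```

Without such a check, a plain hash assignment silently keeps only the last value seen for a repeated name.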
Re^2: iterating through a hash? by Anonymous Monk on Sep 17, 2006 at 10:23 UTC
Thank you very much! You are correct, I should have used 2 hashes instead, it's much easier that way...
Re: iterating through a hash? by Anonymous Monk on Sep 17, 2006 at 10:42 UTC
use List::Compare;

for (List::Compare->new([keys %h1], [keys %h2])->get_Lonly) {
    print "$_ $h1{$_}\n";
}
Re^2: iterating through a hash? by Cristoforo (Curate) on Sep 17, 2006 at 20:12 UTC
This doesn't list each item from file1. It only lists those items in file1 that are not in the second file.
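For what it's worth, the "in the first but not in the second" list can also be had in core Perl with grep and exists, with no module needed. The sketch below assumes the two files from the question have already been read into %h1 and %h2; the variable name @only_in_first is made up for the example.

```perl
use strict;
use warnings;

# Data from the question, assumed to be already loaded into two hashes.
my %h1 = (
    peter  => 1234, nick   => 1111, john  => 4567, mike => 3333,
    george => 2222, antony => 5632, migel => 1209,
);
my %h2 = (
    john  => 7559, mike => 3333, george => 2222,
    peter => 5643, nick => 1111, julia  => 3456,
);

# Names present in %h1 but absent from %h2, sorted for stable output.
my @only_in_first = sort grep { !exists $h2{$_} } keys %h1;

print "$_ $h1{$_}\n" for @only_in_first;
```

For this data it prints antony and migel with their numbers. Like the List::Compare version, it covers only the missing names, so the OK/WRONG comparison still has to be done separately.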
Re: iterating through a hash? by Not_a_Number (Prior) on Sep 17, 2006 at 23:12 UTC
use strict;
use warnings;

# Assumption: file names are passed on the command line
# (the original snippet used $FILE1 and $FILE2 without declaring them).
my ( $FILE1, $FILE2 ) = @ARGV;

open my $fh, '<', $FILE2 or die "Can't open file2: $!\n";
my %hash = map split, <$fh>;

open $fh, '<', $FILE1 or die "Can't open file1: $!\n";
while ( <$fh> ) {
    my ( $k, $v ) = split;
    my $ok = ( defined $hash{$k} and $hash{$k} == $v ) ? 'OK' : 'WRONG';
    print join( "\t", $k, $ok, $v ), "\n";
}

Update: Output, as per spec (tabs being equal):

peter  WRONG   1234
nick   OK      1111
john   WRONG   4567
mike   OK      3333
george OK      2222
antony WRONG   5632
migel  WRONG   1209
Re: iterating through a hash? by graff (Chancellor) on Sep 17, 2006 at 21:40 UTC
shmem is right about how you should open the second file only once, and as other sub-replies there point out, just one hash is enough to do everything you want. After you read the first file into a hash, you read each line of the second file, but instead of storing this, you check if each name exists in the hash; if so, print "OK" or "WRONG" according to whether the values match and then delete that hash element. When done reading the second file, any remaining hash elements are names in file1 that were not in file2 -- so that is where you iterate through the hash, to print those out at the end.