Thanks for the clarification. In that case, I think you're on the right track: put fileB in a hash, then go through fileA, checking whether each key from fileA exists as a key in that hash. That's the standard idiom for this kind of thing, but in your case there's an extra step: once you find a match on the first-column keys, you also need to check for a match on the second column. That might look something like the code below (untested). The tricky part may be the inner if comparison. In mine, I'm just testing whether either value is found as a substring of the other. If you need something more sophisticated, that's the place to adjust it.

# %b is a hash already containing the values from fileB, with the
# first column as keys and the second column as values.
# $file_of_matches is a filehandle opened to one output file
# $file_of_misses is a filehandle for the other output file

open my $fileA, '<', 'fileA' or die $!;
while( my $line = <$fileA> ){                  # get a line from fileA
    chomp $line;
    my( $k, $v ) = split /\t/, $line;          # split the line on tab
    if( exists $b{$k} ){                       # do first columns match?
        # does one second-column value contain the other as a substring?
        # index() compares literally, so regex metacharacters in the data are harmless
        if( index( $b{$k}, $v ) >= 0 or index( $v, $b{$k} ) >= 0 ){
            print $file_of_matches "$line\n";  # yes, so print it to the match file
            next;                              # and loop to the next line
        }
    }
    print $file_of_misses "$line\n";           # no, so print it to the non-match file
}
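The loop above assumes %b has already been filled from fileB. In case it helps, here is a minimal sketch of that step (also lightly tested at best). The sub name load_pairs is just something I made up for illustration, and I'm assuming fileB is tab-separated like fileA and that the first column is unique; if it isn't, later lines silently overwrite earlier ones.

```perl
use strict;
use warnings;

# Hypothetical helper: read a two-column, tab-separated file into a hash,
# first column as keys and second column as values.
sub load_pairs {
    my( $path ) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    my %pairs;
    while( my $line = <$fh> ){
        chomp $line;
        my( $k, $v ) = split /\t/, $line, 2;  # split on the first tab only
        next unless defined $v;               # skip lines that have no tab
        $pairs{$k} = $v;                      # duplicate keys: last one wins
    }
    close $fh;
    return %pairs;
}
```

You would call it as my %b = load_pairs('fileB'); before opening fileA.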

By the way, note that this:

while( my $line = <$fileA> ){
    # do stuff with $line
}

# replaces this:

while( <$fileA> ){
    my $line = $_;
    # do stuff with $line
}

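It saves a line and avoids potential bugs caused by using $_ sort of halfway: $_ is a global, so anything you call inside the loop that also assigns to it can change it out from under you before you copy it into $line.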
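To make that concrete, here is a small sketch of the kind of bug I mean. The helper count_lines and the sample data are invented for the demonstration; the point is only that a sub using the implicit-$_ form of while leaves $_ undefined when it returns.

```perl
use strict;
use warnings;

# Hypothetical helper that reads a handle with the implicit-$_ form.
sub count_lines {
    my( $text ) = @_;
    open my $fh, '<', \$text or die $!;   # in-memory filehandle
    my $n = 0;
    $n++ while <$fh>;                     # assigns each line to the global $_
    return $n;
}

my $data = "a\nb\n";
my( @halfway, @lexical );

open my $in, '<', \$data or die $!;
while( <$in> ){
    count_lines( "x\ny\n" );   # clobbers $_ before we have copied it
    my $line = $_;
    push @halfway, $line;      # undef on both passes
}
close $in;

open $in, '<', \$data or die $!;
while( my $line = <$in> ){
    count_lines( "x\ny\n" );   # still clobbers $_, but $line is unaffected
    push @lexical, $line;      # "a\n", then "b\n"
}
close $in;
```

Copying $_ into $line as the very first statement of the loop body narrows the window, but the lexical form closes it entirely.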

Aaron B.
Available for small or large Perl jobs; see my home node.