in reply to Comparing two columns

The following makes no assumption about which of your input files has duplicates. It demonstrates the use of a nested hash (keyed by id, then by source) whose leaves are arrays, so that multiple values can be captured for each key. Only ids that occur in both inputs are printed. You may be able to adapt it to produce the results you need.

my %data;

# Collect every value seen for each id in the first input.
foreach my $line (@ge_data) {
    my ($id, $rest) = split(/\s+/, $line, 2);
    push(@{$data{$id}{ge_data}}, $rest);
}

# Do the same for the second input, under a separate key.
foreach my $line (@file_data) {
    my ($id, $rest) = split(/\s+/, $line, 2);
    push(@{$data{$id}{file_data}}, $rest);
}

# Report only the ids that appear in both inputs.
# Note: the lines are not chomped, so each value still ends with "\n".
foreach my $id (sort keys %data) {
    next unless(exists $data{$id}{ge_data});
    next unless(exists $data{$id}{file_data});
    print "$id:\n";
    print "\tge_data:\n";
    print "\t\t" . join("\t\t", @{$data{$id}{ge_data}});
    print "\tfile_data:\n";
    print "\t\t" . join("\t\t", @{$data{$id}{file_data}});
}
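If it helps with adapting this, here is a minimal, self-contained sketch of one possible variation: it builds the same structure and then summarises each id on one line, noting whether the id turned up in both inputs or only one. The sample lines are made up for illustration, and add_lines and report are hypothetical helper names, not anything from your code. Unlike the code above, it chomps each line first.

```perl
use strict;
use warnings;

# Made-up sample data; in practice these would be the lines read from your files.
my @ge_data   = ("a1 10\n", "a2 20\n", "a1 30\n");
my @file_data = ("a1 x\n", "a3 y\n");

# Hypothetical helper: store every value seen for each id under the given source name.
sub add_lines {
    my ($data, $source, @lines) = @_;
    foreach my $line (@lines) {
        chomp(my $copy = $line);                    # work on a copy, minus the newline
        my ($id, $rest) = split(/\s+/, $copy, 2);
        next unless defined $id && length $id;      # skip blank or indented lines
        push(@{ $data->{$id}{$source} }, $rest // '');
    }
    return;
}

# Hypothetical helper: one summary line per id, listing the values from each source.
sub report {
    my ($data) = @_;
    my @out;
    foreach my $id (sort keys %$data) {
        my @sources = sort keys %{ $data->{$id} };
        my @parts   = map { "$_=" . join(',', @{ $data->{$id}{$_} }) } @sources;
        my $note    = @sources == 2 ? 'both' : "only $sources[0]";
        push @out, "$id ($note): @parts";
    }
    return @out;
}

my %data;
add_lines(\%data, 'ge_data',   @ge_data);
add_lines(\%data, 'file_data', @file_data);

print "$_\n" for report(\%data);
```

With the sample data above this prints:

a1 (both): file_data=x ge_data=10,30
a2 (only ge_data): ge_data=20
a3 (only file_data): file_data=y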