Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow monks!
I started using Perl a couple of weeks ago, and I am now faced with my first serious problem.
I have 2 files, let's say with words, namely:
FILE1
one two three four five six seven
#############
FILE2
<nine> two eleven twenty one thirty forty
What I want is to create 3 files: the FIRST will store the words that appear only in FILE1,
the SECOND will store the words that appear only in FILE2, and the third will store the common words.
I know how to open files and create arrays or hashes with the words they contain, and I have created 2 hashes, one for each file.
What I don't know is how to compare the 2 hashes and find the common and not common words, so that I can print them to the respective output files.
Thank you in advance!

Replies are listed 'Best First'.
Re: how to find common and not common lines in 2 files?
by Roy Johnson (Monsignor) on Aug 21, 2007 at 21:08 UTC
    So you want to go through hash1 and separate what's common from what's unique. Then you want to find the unique things in hash2.
    my (@common, @uniq1, @uniq2);
    for (keys %hash1) {
        if (exists $hash2{$_}) {
            push @common, $_;
            delete $hash2{$_};  # All that will be left in hash2 is what wasn't in hash1
        }
        else {
            push @uniq1, $_;
        }
    }
    @uniq2 = keys %hash2;
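
    For anyone who wants to try the loop above end to end, here is a minimal sketch. It assumes one word per line in each file and uses in-memory word lists adapted from the question instead of real files. The helper name split_words is hypothetical, and it works on a copy of the second hash so the caller's data is not modified.

```perl
use strict;
use warnings;

# Sample words adapted from the question (assumption: one word per line).
my @file1 = qw(one two three four five six seven);
my @file2 = qw(nine two eleven twenty one thirty forty);

# Build one lookup hash per file, as the original poster describes.
my %hash1 = map { $_ => 1 } @file1;
my %hash2 = map { $_ => 1 } @file2;

# split_words is a hypothetical helper: it returns three array refs
# (common, only in first, only in second) and leaves its inputs untouched.
sub split_words {
    my ($h1, $h2) = @_;
    my %rest = %$h2;                 # work on a copy of the second hash
    my (@common, @only1);
    for my $word (sort keys %$h1) {
        if (exists $rest{$word}) {
            push @common, $word;
            delete $rest{$word};     # whatever remains was never seen in $h1
        }
        else {
            push @only1, $word;
        }
    }
    return (\@common, \@only1, [ sort keys %rest ]);
}

my ($common, $only1, $only2) = split_words(\%hash1, \%hash2);

print "common:        @$common\n";
print "only in FILE1: @$only1\n";
print "only in FILE2: @$only2\n";
```

    With the sample lists this prints "one two" as common, "five four seven six three" for FILE1 only, and "eleven forty nine thirty twenty" for FILE2 only. The keys are sorted purely to make the output order predictable.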

    Caution: Contents may have been coded under pressure.
Re: how to find common and not common lines in 2 files?
by akho (Hermit) on Aug 21, 2007 at 22:47 UTC
    Better done using a single hash (didn't test this, use at your own risk. Should work, though):
    use strict;
    use warnings;

    my %in_files;

    # Tag every line with the file(s) it appears in: the value ends up
    # containing '1', '2', or both.
    open(my $f1, '<', 'FILE1') or die "can't open FILE1: $!\n";
    while (<$f1>) {
        $in_files{$_} .= '1';
    }

    open(my $f2, '<', 'FILE2') or die "can't open FILE2: $!\n";
    while (<$f2>) {
        $in_files{$_} .= '2';
    }

    open(my $common, '>', 'common_lines') or die "can't open common_lines: $!\n";
    open(my $u1, '>', 'unique_1') or die "can't open unique_1: $!\n";
    open(my $u2, '>', 'unique_2') or die "can't open unique_2: $!\n";

    # The lines still carry their newlines, so they can be printed as-is.
    for (keys %in_files) {
        if ($in_files{$_} =~ m/12/) {
            print $common $_;
        }
        elsif ($in_files{$_} =~ m/1/) {
            print $u1 $_;
        }
        else {
            print $u2 $_;
        }
    }
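
    As a variation on the single-hash idea (my own sketch, not akho's code), the tags can be kept as bit flags instead of appended strings, which avoids the regex tests. It assumes one word per line, reads from in-memory strings that stand in for FILE1 and FILE2, and chomps each line so a missing final newline does not cause a mismatch. The helper name tag_lines is hypothetical.

```perl
use strict;
use warnings;

# Stand-ins for FILE1 and FILE2 (assumption: one word per line).
my $text1 = join '', map { "$_\n" } qw(one two three four five six seven);
my $text2 = join '', map { "$_\n" } qw(nine two eleven twenty one thirty forty);

my %seen;    # line => bitmask: 1 = first input, 2 = second input, 3 = both

# tag_lines is a hypothetical helper that ORs $bit into the mask of every line.
sub tag_lines {
    my ($text, $bit, $seen) = @_;
    # Opening a reference to a scalar gives an in-memory filehandle.
    open my $fh, '<', \$text or die "can't open in-memory file: $!\n";
    while (my $line = <$fh>) {
        chomp $line;
        $seen->{$line} = ($seen->{$line} || 0) | $bit;
    }
    close $fh;
}

tag_lines($text1, 1, \%seen);
tag_lines($text2, 2, \%seen);

# Group the lines by mask; sorting only makes the output order predictable.
my %bucket = (1 => [], 2 => [], 3 => []);
push @{ $bucket{ $seen{$_} } }, $_ for sort keys %seen;

print "only in first:  @{ $bucket{1} }\n";
print "only in second: @{ $bucket{2} }\n";
print "in both:        @{ $bucket{3} }\n";
```

    To work on real files, replace the in-memory opens with ordinary ones and print each bucket to its own output file.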
Re: how to find common and not common lines in 2 files?
by toolic (Bishop) on Aug 21, 2007 at 21:07 UTC
    There was a recent node which addressed a similar problem. See Erase entries from files.

    If you use *nix, you could just use comm:

    sort -u FILE1 > FILE1.sorted
    sort -u FILE2 > FILE2.sorted
    comm -23 FILE1.sorted FILE2.sorted > FIRST
    comm -13 FILE1.sorted FILE2.sorted > SECOND
    comm -12 FILE1.sorted FILE2.sorted > common
Re: how to find common and not common lines in 2 files?
by dogz007 (Scribe) on Aug 21, 2007 at 21:27 UTC
    Here's a one-liner (at least the real work is done in one statement) that will get the job done. It ties all three files to arrays and then greps through them to find the common lines.

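    use strict;
    use Tie::File;

    # Each array is tied to a file, so changing the array rewrites the file.
    # file1.txt and file2.txt are therefore modified in place.
    tie my @f1, 'Tie::File', 'file1.txt' or die;
    tie my @f2, 'Tie::File', 'file2.txt' or die;
    tie my @f3, 'Tie::File', 'file3.txt' or die;

    @f1 = grep {
        my $word = $_;
        my $size = $#f2;
        # Move every match from @f2 into @f3 and keep the rest in @f2.
        @f2 = grep {
            if ($_ eq $word) { push @f3, $_; 0 }
            else             { 1 }
        } @f2;
        $size == $#f2;    # keep $word in @f1 only if nothing was removed from @f2
    } @f1;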

    Outputs the following for your example files:

    file1.txt

    three four five six seven

    file2.txt

    nine eleven twenty thirty forty

    file3.txt

    one two
Re: how to find common and not common lines in 2 files?
by misc (Friar) on Aug 21, 2007 at 21:45 UTC
    Since you said you started with Perl a few weeks ago,
    I'd like to give you just a hint.
    Why don't you read the second file line by line and ...

    Besides, as this was your question, you could find the common and uncommon keys of two hashes with a foreach loop: iterate over the keys of hash a and test whether each one is present in hash b.

    The hash category in perlfunc should also help you.
    Did you miss exists?

    michael