Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow monks!
Let's say you have a file named FILE1, with 10 names in it,
and a FILE2 with 100 names in it.
You know that the 10 names from FILE1 are contained in FILE2.
How would you go about keeping the remaining 90 from FILE2?
I create an array with the 10 names and start scanning FILE2. How do I keep the rest of the names from FILE2 and exclude the 10 names from FILE1?
Thank you very much!

Replies are listed 'Best First'.
Re: remove entries that match
by Corion (Patriarch) on Jul 12, 2007 at 06:39 UTC

    This is a FAQ. Please type perldoc -q intersection or see perlfaq4, "How do I compute the difference of two arrays".
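
The FAQ answer comes down to a hash lookup. Here is a minimal sketch of that idea, assuming the names are already in two arrays; the sample names and the variable names (@exclude, @all, @remaining) are made up for illustration and would normally be read from FILE1 and FILE2.

```perl
use strict;
use warnings;

# Hypothetical sample data; in practice these would be read from FILE1 and FILE2.
my @exclude = qw(alice bob);
my @all     = qw(alice carol bob dave);

# Build a lookup hash whose keys are the names to drop.
my %seen;
@seen{@exclude} = ();

# Keep only the names that are not in the lookup hash, preserving their order.
my @remaining = grep { !exists $seen{$_} } @all;

print "$_\n" for @remaining;
```

Unlike a nested loop over both arrays, each name in the larger list costs only a single hash lookup.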

Re: remove entries that match
by Samy_rio (Vicar) on Jul 12, 2007 at 06:39 UTC

    Hi, read the names from the two files into two separate arrays and compare them using the List::Compare module. It reports the names that are present in only one of the files, as below.

    use strict;
    use warnings;
    use List::Compare;

    my @file1 = qw(name1 name2 name3 name4 name5 name6 name7 name8 name9 name10);
    my @file2 = qw(name1 name2 name3 name4 name5 name6 name7 name8 name9 name10 name11);

    my $lcma = List::Compare->new(\@file1, \@file2);
    print $lcma->get_complement;  # extra present in file2
    print $lcma->get_unique;      # extra present in file1

    Regards,
    Velusamy R.


    eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

Re: remove entries that match
by mickeyn (Priest) on Jul 12, 2007 at 07:09 UTC
    'Remaining 90' holds only if FILE2 contains 100 unique names.
    Assuming that's not exactly what you meant (fewer than 90 can remain if there are repetitions in FILE2), this code should work for you:
    use strict;
    use warnings;
    use Tie::File;

    # read the names to exclude from the first file
    open my $fh, '<', 'FILE1' or die "can't open FILE1: $!\n";
    chomp(my @file1 = <$fh>);
    close $fh;

    # remove them from the second file, in place
    tie my @file2, 'Tie::File', 'FILE2' or die "can't open FILE2: $!\n";
    foreach my $x (@file1) {
        # \Q...\E keeps regex metacharacters in a name from being interpreted
        @file2 = grep { !/^\Q$x\E$/ } @file2;
    }
    untie @file2;
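
If rewriting FILE2 in place is not wanted, a single pass that writes to a separate file is another option. The following is only a sketch under assumptions: filter_names is a hypothetical helper, not something from a module, and the output file name in the usage comment is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every line of $source that is not listed in
# $exclude into $out, and return the number of lines kept.
sub filter_names {
    my ($exclude, $source, $out) = @_;

    # Load the names to drop into a hash for constant-time lookups.
    open my $ex, '<', $exclude or die "can't open $exclude: $!\n";
    chomp(my @names = <$ex>);
    close $ex;
    my %drop = map { $_ => 1 } @names;

    open my $in, '<', $source or die "can't open $source: $!\n";
    open my $fh, '>', $out    or die "can't open $out: $!\n";
    my $kept = 0;
    while (my $line = <$in>) {
        chomp $line;
        next if $drop{$line};      # skip names listed in the exclude file
        print {$fh} "$line\n";
        $kept++;
    }
    close $in;
    close $fh or die "can't close $out: $!\n";
    return $kept;
}

# Usage, with FILE2.filtered as an assumed output name:
# my $kept = filter_names('FILE1', 'FILE2', 'FILE2.filtered');
```

This scans FILE2 once, leaves the original untouched, and drops every repetition of an excluded name, so the returned count shows directly whether exactly 90 names remained.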

    HTH,
    Mickey

Re: remove entries that match
by sgt (Deacon) on Jul 12, 2007 at 07:44 UTC

    Tough not to mention a un*x golf solution ;)

    $ comm -13 <(sort FILE1) <(sort FILE2) > OUT
    # comm gives 3 cols
    #   1: lines only in first file
    #   2: lines only in second file
    #   3: common lines
    # flag -# suppresses col #, so -13 leaves only col 2 (the names unique to FILE2)

    The magic <(...) (process substitution) saves you a temporary file. If your shell does not support it, just do 'sort FILE1 -o FILE1.s' first (and likewise for FILE2), then run comm on the sorted copies. comm needs sorted files for its magic.

    cheers --stephan