Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!
I have a problem and I wanted to see if this can be done.
Suppose you have a file (FILE1) with a bunch of names, like
nick george peter olga adam wolfgang antony george mike steven
and a smaller file (FILE2) with just 3 or 4 names from the big file, like
george olga steven
Is there a way either to erase the names that appear in file2 from file1 or, if this can't be done,
to create a new file with just the remaining names from file1?
What I have tried is to read files 1 and 2, create a hash from the names in file2, and then,
if the name I read as I iterate over file1 does not exist in the hash from file2, print it to a new output file.
But my problem is that, by doing so, I get multiple entries for the names that appear only in file1...
Any hints?

Replies are listed 'Best First'.
Re: Erase entries from files
by dogz007 (Scribe) on Aug 14, 2007 at 21:16 UTC
    This is a simple but very useful example of a problem that can be quickly solved using Tie::File. If you tie both files to their own arrays, searching and deleting between the two becomes easy.

    use strict;
    use Tie::File;

    tie my @file1, 'Tie::File', 'file1.txt' or die;
    tie my @file2, 'Tie::File', 'file2.txt' or die;

    @file1 = grep { my $name = $_; ! grep { $_ eq $name } @file2 } @file1;

    That last line iterates over each line of @file1 and only lets it hang around if it's not found in @file2. Using your example files, the code above transforms file1.txt into the following:

    nick peter adam wolfgang antony mike

    See node 629253 for links to tutorials on Tie::File.

Re: Erase entries from files
by FunkyMonk (Chancellor) on Aug 14, 2007 at 20:17 UTC
    In principle your method sounds good enough. What about showing us your code (in <code>...</code> tags please)?
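    For reference, the method described in the question could look roughly like the sketch below. It assumes one name per line in each file; the helper name filter_names and the file name remaining.txt are made up for illustration. Because the big file is read in a single loop and each line is tested once against the hash, a name is written only as often as it occurs in file1.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the lines of $in to $out, skipping every
# name that is listed in $skip. Assumes one name per line.
sub filter_names {
    my ($in, $skip, $out) = @_;

    # Build the lookup hash from the smaller file.
    open my $skip_fh, '<', $skip or die "Cannot open $skip: $!";
    my %remove;
    while (my $name = <$skip_fh>) {
        chomp $name;
        $remove{$name} = 1;
    }
    close $skip_fh;

    # Single pass over the big file: each line is tested exactly once.
    open my $in_fh,  '<', $in  or die "Cannot open $in: $!";
    open my $out_fh, '>', $out or die "Cannot open $out: $!";
    while (my $name = <$in_fh>) {
        chomp $name;
        print {$out_fh} "$name\n" unless exists $remove{$name};
    }
    close $in_fh;
    close $out_fh or die "Cannot close $out: $!";
    return;
}

# Example call (file names are placeholders):
# filter_names('file1.txt', 'file2.txt', 'remaining.txt');
```

    If the output still shows repeated names with code shaped like this, the usual suspect is a second loop over file2 nested inside the loop over file1, but that is a guess until the actual code is posted.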

Re: Erase entries from files
by toolic (Bishop) on Aug 14, 2007 at 21:05 UTC
    If you use *nix, you could just use comm:
    sort -u file1 > file1.sorted
    sort -u file2 > file2.sorted
    comm -23 file1.sorted file2.sorted
    Produces this output:
    adam antony mike nick peter wolfgang
    I know, I know... you're just dying to use Perl. Who isn't? :) Update: fixed typo
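
    For those dying to use Perl, a rough equivalent of the sort -u / comm -23 pipeline might look like this. It is only a sketch: the helper name unique_to_first is invented here, the names are hard-coded instead of read from the files, and Perl's default string sort is assumed to order them the same way sort does in your locale.

```perl
use strict;
use warnings;

# Hypothetical helper: sorted, de-duplicated names that occur in
# @$first but not in @$second (roughly what sort -u plus comm -23 gives).
sub unique_to_first {
    my ($first, $second) = @_;
    my %in_second = map { $_ => 1 } @$second;
    my %seen;

    # Keep a name only if it is absent from the second list and has
    # not been kept before.
    my @kept   = grep { !$in_second{$_} && !$seen{$_}++ } @$first;
    my @sorted = sort { $a cmp $b } @kept;
    return @sorted;
}

my @file1 = qw(nick george peter olga adam wolfgang antony george mike steven);
my @file2 = qw(george olga steven);
print "$_\n" for unique_to_first(\@file1, \@file2);
```

    With the sample names from the question this should print adam, antony, mike, nick, peter and wolfgang, one per line.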
Re: Erase entries from files
by suaveant (Parson) on Aug 14, 2007 at 21:02 UTC
    When you find a match, delete it from the hash, then it can't be matched again later.
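
    Read literally, that suggestion could be sketched as below (the helper name remove_once is made up, and the names are passed as lists rather than read from files). One consequence worth knowing: once a name has been deleted from the hash it is no longer filtered, so each entry in file2 cancels only one occurrence in file1.

```perl
use strict;
use warnings;

# Hypothetical helper: drop a name from @$names the first time it
# matches an entry of @$remove, then forget that entry.
sub remove_once {
    my ($names, $remove) = @_;
    my %pending = map { $_ => 1 } @$remove;
    my @kept;
    for my $name (@$names) {
        if (exists $pending{$name}) {
            delete $pending{$name};   # cannot be matched again later
            next;
        }
        push @kept, $name;
    }
    return @kept;
}

my @file1 = qw(nick george peter olga adam wolfgang antony george mike steven);
my @file2 = qw(george olga steven);
print join(' ', remove_once(\@file1, \@file2)), "\n";
# prints: nick peter adam wolfgang antony george mike
```

    Note that the second george survives, which may or may not be what is wanted.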

                    - Ant

Re: Erase entries from files
by mamawe (Sexton) on Aug 14, 2007 at 21:19 UTC
    I can't tell from your text whether you want to keep duplicate entries from file 1 (as you would get if you simply erased the names in file 2) or not (as your last sentence suggests).

    If you want them, your method is fine.

    If you don't want them, simply add each name from file 1 to your hash the first time you encounter it, so that you don't output it again.
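
    A minimal sketch of that second variant, assuming the names are already in memory as lists (the helper name remaining_unique and the sample names are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: names from @$names that are not in @$remove,
# keeping only the first occurrence of each.
sub remaining_unique {
    my ($names, $remove) = @_;

    # Start with the names to erase ...
    my %skip = map { $_ => 1 } @$remove;
    my @kept;
    for my $name (@$names) {
        next if $skip{$name};
        push @kept, $name;
        # ... and add every name we output, so it is not taken again.
        $skip{$name} = 1;
    }
    return @kept;
}

# Made-up sample in which "nick" appears twice in the big list.
my @remaining = remaining_unique(
    [qw(nick george peter nick olga)],
    [qw(george olga)],
);
print "@remaining\n";   # prints: nick peter
```

    The same hash does double duty here: it holds both the names to erase and the names already written.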