Erase entries from files

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!
I have a problem and I wanted to see if this can be done.
Suppose you have a file (FILE1) with a bunch of names, like

nick
george
peter
olga
adam
wolfgang
antony
george
mike
steven

and a smaller file (FILE2) with just 3 or 4 names from the big file, like

george
olga
steven

Is there a way either to erase the names that appear in file2 from file1 or, if this can't be done,
to create a new file with just the remaining names from file1?
What I have tried is read files 1 and 2, create a hash from the names in file2 and then,
if the name I read as I iterate in file1 does not exist in the hash from file2, I print it in a new output file.
But my problem is that, by doing so, I get multiple entries from the names that appear only in file1...
Any hints?


Replies are listed 'Best First'.
Re: Erase entries from files by dogz007 (Scribe) on Aug 14, 2007 at 21:16 UTC
This is a simple but very useful example of a problem that can be quickly solved using Tie::File. If you tie both files to their own arrays, searching and deleting between the two becomes easy. `use strict; use Tie::File; tie my @file1, 'Tie::File', 'file1.txt' or die; tie my @file2, 'Tie::File', 'file2.txt' or die; @file1 = grep { my $name = $_; ! grep {$_ eq $name} @file2 } @file1;` That last line iterates over each line of `@file1` and only lets it hang around if it's not found in `@file2`. Using your example files, the code above transforms file1.txt into the following: `nick peter adam wolfgang antony mike` (one name per line). See node 629253 for links to tutorials on Tie::File.
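For larger files, the nested grep above rescans `@file2` once for every line of `@file1`. A minimal sketch of the same Tie::File idea with a hash lookup follows. The helper name `erase_listed_lines` and the temporary files are assumptions made for illustration; they are not part of the original post.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical helper: removes from the file at $target_path every line
# that also appears in the file at $exclude_path. Edits the target in place.
sub erase_listed_lines {
    my ($target_path, $exclude_path) = @_;

    tie my @target,  'Tie::File', $target_path  or die "Cannot tie $target_path: $!";
    tie my @exclude, 'Tie::File', $exclude_path or die "Cannot tie $exclude_path: $!";

    # Build the lookup table once instead of rescanning @exclude per line.
    my %skip = map { $_ => 1 } @exclude;

    # Keep only the lines that are not in the lookup table.
    @target = grep { !$skip{$_} } @target;

    untie @exclude;
    untie @target;
    return;
}

# Demonstration on throwaway files (a shortened version of the example data).
my ($fh1, $path1) = tempfile(UNLINK => 1);
print {$fh1} "$_\n" for qw(nick george peter olga george steven);
close $fh1;

my ($fh2, $path2) = tempfile(UNLINK => 1);
print {$fh2} "$_\n" for qw(george olga steven);
close $fh2;

erase_listed_lines($path1, $path2);

# Show what is left in the first file.
open my $in, '<', $path1 or die "Cannot read $path1: $!";
print while <$in>;
close $in;
```

Like the original snippet, this rewrites the first file in place, so it is worth trying on a copy first.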
Re: Erase entries from files by FunkyMonk (Chancellor) on Aug 14, 2007 at 20:17 UTC
In principle your method sounds good enough. What about showing us your code (in `<code>...</code>` tags please)?
Re: Erase entries from files by toolic (Bishop) on Aug 14, 2007 at 21:05 UTC
If you use *nix, you could just use comm: `sort -u file1 > file1.sorted; sort -u file2 > file2.sorted; comm -23 file1.sorted file2.sorted` Produces this output (one name per line): `adam antony mike nick peter wolfgang` I know, I know... you're just dying to use Perl. Who isn't? :) Update: fixed typo
Re: Erase entries from files by suaveant (Parson) on Aug 14, 2007 at 21:02 UTC
When you find a match, delete it from the hash, then it can't be matched again later. - Ant
Re: Erase entries from files by mamawe (Sexton) on Aug 14, 2007 at 21:19 UTC
I don't understand from your text whether you want duplicate entries from file 1 (as you would get if you simply erased the names from file 2) or not (as you said in the last sentence). If you want them, your method is fine. If you don't want them, simply add each name from file 1 to your hash the first time you encounter it, so that you do not take it again.
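A minimal sketch of that suggestion, covering both readings of the question. The helper name `remaining_names` and its `$unique` flag are hypothetical, and the names are held in arrays here rather than read from the two files.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the entries of @$names that do not appear
# in @$exclude. When $unique is true, each surviving name is kept only
# the first time it is seen.
sub remaining_names {
    my ($names, $exclude, $unique) = @_;

    # Start the hash with the names from file 2.
    my %skip = map { $_ => 1 } @$exclude;

    my @kept;
    for my $name (@$names) {
        next if $skip{$name};
        push @kept, $name;
        # Remember names from file 1 as well, so repeats are dropped.
        $skip{$name} = 1 if $unique;
    }
    return @kept;
}

# Example data from the question.
my @file1 = qw(nick george peter olga adam wolfgang antony george mike steven);
my @file2 = qw(george olga steven);

print "$_\n" for remaining_names(\@file1, \@file2, 1);
```

With the example data this prints nick, peter, adam, wolfgang, antony and mike, one per line. Passing a false `$unique` keeps repeated names from file 1 instead of collapsing them.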