I wanted to remove every email address listed in an "exclude" file from the main (mailing list) file, so I tried:
grep -vf exclude.list mail.list > new.list
It used HEAPS of memory and ran for about half an hour on my dual-processor PIII 866.
I thought I'd chance rewriting it in Perl, and it took 25 seconds to run and produced the same result!
The same thing written in a bash shell script using a for loop with a grep and checking the exit code took 4.5 minutes to run.
Long live Perl!!!
#!/usr/bin/perl -w
#
# only-in
# find lines which are in the first file, but not in the second.
#
use strict;
die "Usage: $0 INPUT EXCLUDE\n" unless($#ARGV == 1);
my $input_file = shift;
my $exclude_file = shift;
# Three-argument open with lexical filehandles, so odd characters in a
# file name cannot be mistaken for an open mode.
open(my $in, '<', $input_file)
    or die "Can't open input file '$input_file': $!\n";
my @input = <$in>;
close($in);
open(my $ex, '<', $exclude_file)
    or die "Can't open exclude file '$exclude_file': $!\n";
my @exclude = <$ex>;
close($ex);
my @good;
for my $data (@input) {
    # \Q...\E quotes regex metacharacters, so "." and "+" in an address match literally.
    push(@good, $data) unless grep { /^\Q$data\E$/i } @exclude;
}
print join("", @good);
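The script above still compares every input line against every exclude line, so the work grows with the product of the two file sizes. If exact, case-insensitive whole-line matches are all that's needed, a hash lookup should scale much better. Here is a minimal sketch of that idea, not a tested replacement: the `only_in` helper and the sample addresses are made up for illustration, and it also ignores line endings, which the script above does not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): returns the lines of
# @$input that do not appear in @$exclude, ignoring case and line endings.
sub only_in {
    my ($input, $exclude) = @_;

    # Build a lookup table keyed on the normalised exclude lines.
    my %skip;
    for my $line (@$exclude) {
        (my $key = lc $line) =~ s/\r?\n\z//;
        $skip{$key} = 1;
    }

    # Keep each input line whose normalised form is not in the table.
    my @good;
    for my $line (@$input) {
        (my $key = lc $line) =~ s/\r?\n\z//;
        push @good, $line unless exists $skip{$key};
    }
    return @good;
}

# Small in-memory example; real use would read the two files into arrays first.
my @mail    = ("alice\@example.com\n", "Bob+news\@example.com\n", "carol\@example.com\n");
my @exclude = ("bob+news\@example.com\n");
my @kept    = only_in(\@mail, \@exclude);
print @kept;
```

Each line is looked up once in the hash instead of being matched against the whole exclude list, and because no regex is built from the data there is nothing to escape.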