I wanted to take all the email addresses in an "exclude" file out of the main (mailing list) file, so I tried:
grep -vf exclude.list mail.list > new.list
It took HEAPS of memory and ran for about half an hour on my dual-processor PIII 866.
I thought I'd chance rewriting it in Perl, and it took 25 seconds to run and produced the same result!
The same thing written in a bash shell script, using a for loop with a grep and checking the exit code, took 4.5 minutes to run.
Long live Perl!!!
#!/usr/bin/perl -w
#
# only-in
# find lines which are in the first file, but not in the second.
#
use strict;
die "Usage: $0 INPUT EXCLUDE\n" unless($#ARGV == 1);
my $input_file = shift;
my $exclude_file = shift;
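# Read both files into memory, stripping the line endings.
open(my $in, '<', $input_file)
    or die "Can't open input file '$input_file': $!\n";
chomp(my @input = <$in>);
close($in);
open(my $ex, '<', $exclude_file)
    or die "Can't open exclude file '$exclude_file': $!\n";
chomp(my @exclude = <$ex>);
close($ex);
# Keep each input line that matches no exclude line (case-insensitive).
# \Q...\E quotes regex metacharacters such as "." and "+" in addresses.
my @good;
for my $data (@input) {
    push(@good, $data) unless grep { /^\Q$data\E$/i } @exclude;
}
print "$_\n" for @good;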
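The script above still scans the whole exclude list once for every input line. A minimal sketch of a hash-based variant follows, assuming a case-insensitive match on the whole line is what is wanted. The names only_in and read_lines are made up for illustration, and this variant has not been timed against the figures quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# only_in(\@input, \@exclude): return the input lines that do not appear
# in the exclude list, compared case-insensitively (hypothetical helper).
sub only_in {
    my ($input, $exclude) = @_;
    # Build the lookup table once: one lowercased key per excluded line.
    my %skip;
    $skip{ lc($_) } = 1 for @$exclude;
    # Each membership test is now a single hash lookup.
    return grep { !$skip{ lc($_) } } @$input;
}

# Read a file into a list of lines with the line endings removed.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open '$path': $!\n";
    chomp(my @lines = <$fh>);
    close($fh);
    return @lines;
}

# Run as a script only when both file names are given.
if (@ARGV == 2) {
    my @input   = read_lines($ARGV[0]);
    my @exclude = read_lines($ARGV[1]);
    print "$_\n" for only_in(\@input, \@exclude);
}
```

Saved under a name of your choosing, it takes the same two arguments as the script above: the input file first, then the exclude file. The exclude list is read once to fill the hash, so the work grows with the combined size of the two files rather than with their product.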