http://qs1969.pair.com?node_id=261632
Description: Well, we are all trying to filter spam. SpamAssasin gets high marks, but it requires you to download the mail first. I tried popsneaker(c-program), and it works great on headers, but misses the body, which isn't good enough anymore. So... I decided to try Net::POP3, which can get the headers and head a number of body lines. This is just my first working model, and I know it uses "goto's", but it was just the easiest way to break out of nested loops. The patterns are stored in files, to make it easy to just add them as necessary. It works quite fast on my dialup. To test, just run, and observe the output. For real deletion, uncomment the pop3->delete line near the bottom of the code.

The files at the end are where you put the regex patterns, about the only thing to watch for is backslashing '.'

It runs fast on my dialup line. I just run it before I download my mail with my normal mail client.

Update: I have changed the original post, to a slightly improved version. Aristotle's methods below, are superior to mine, but I think my "clunky method" is a little easier for a beginner to see what is happening and modify. I did eliminate the redundant "regex creation from file" code.

Update2 I've taken Aristotle's recommendations and implemented a greping method to avoid redundant line checks. I also have split the message into 2 arrays, @header and @top. Additionally, I fixed the exit routine so that there always is a logout from the pop server. The only thing that this code may miss, is if the various header lines have a newline in them, but I'm finding the first line of each header is good enough.

#!/usr/bin/perl -w
use strict;
use Net::POP3;

my $maxsize = 50000;
my (@fromgoodre,@frombadre,@badwordre,@maillistre,@badcontentre,@recei
+vedgoodre,@togoodre);

my %rehash = (
   'fromgood' => \@fromgoodre,
   'frombad' => \@frombadre,
   'badwords' => \@badwordre,
   'maillist' => \@maillistre,
   'contentbad' => \@badcontentre,
   'receivedgood' => \@receivedgoodre,
   'togood' => \@togoodre
);

foreach my $file (keys %rehash){
  open(FH,"< $file") or warn "Can't open $file $!\n";
  chomp (my @lines = <FH>);
     foreach my $line(@lines){push(@{$rehash{$file}},qr/$line/i)}
close FH;
}


my $ServerName = "mail.foobar.com";
my $pop3 = Net::POP3->new($ServerName)||die("Couldn't log on to server
+\n");
my $UserName = "zentara";
my $Password = "spam4lunch";

my $num_messages = $pop3->login($UserName, $Password)||die("Bad userna
+me or password\n");
my $messages = $pop3->list();
print "******$messages Number of messages->$num_messages*******\n";
my %messages;
my ($flag,$msg_id,$messcheck,@del_mess,@header,@top,$line);
@del_mess = ();
$messcheck = '';
print "####################################################\n";

foreach $msg_id(keys %{$messages}) {
     $flag = '';
     my @header =(); #array for headerlines 
     my @top = ();  #array for top of messagebody 
     my $messref = $pop3->top($msg_id,10);
     my $size = $pop3->list($msg_id);
     print "message$msg_id->size=$size\n";
     print "----------------------------------\n";

#split into @header and @top
     while(1){
       my $line = shift @$messref;
       last if $line =~ /^\s*$/;
       push(@header,$line);
     }
     @top = @$messref;
     print '~~~~~~~~~~~~~~~~~~@header ->',"\n@header\n";
     print '~~~~~~~~~~~~~~~~~~@top ->',"\n@top\n";


#check X-Mailinglist: 
          if(($line) = grep(/^X-Mailinglist:/o,@header)){
                 print "#################X-Mailinglistline-> $line\n";
                 for my $maillistre (@maillistre) {
                      if ($line =~ /$maillistre/) { print "Mail list g
+ood\n"; $flag = 'ok'; goto DOBODY}
                   }
          }
#check From 
           ($line) = grep(/^From:/io,@header);
           print "#################Fromgrepline-> $line\n";
           my $from = $line;
                for my $goodfromre (@fromgoodre) {
                      if ($line =~ /$goodfromre/) { print "GoodFrom->$
+goodfromre\n"; $flag = 'ok'; goto DOBODY}
                  }
                  for my $badfromre (@frombadre) {
                      if ($line =~ /$badfromre/) { print "BadFrom->$ba
+dfromre\n"; push(@del_mess,$msg_id); $messcheck .= "$msg_id -> $badfr
+omre ->$from\n"; $flag = 'bad'; goto DOBODY}
                  }

#check To and CC 
            ($line) = grep(/^(To|Cc|cc|Return-Path):/o,@header);
            print "#################To-CC-ReturnPathgrepline-> $line\n
+";
                 for my $togoodre (@togoodre) {
                      if ($line =~ /$togoodre/) { print "ToCCReturn-Pa
+th address good\n"; $flag = 'ok'; goto DOBODY}
                   }

#check Received         
            ($line) = grep(/^Received:/o,@header);
            print "#################Receivedgrepline-> $line\n";
                 for my $receivedgoodre (@receivedgoodre) {
                      if ($line =~ /$receivedgoodre/) { print "Receive
+d address good\n"; $flag = 'ok'; goto DOBODY}
                   }

#check Content-type 
            ($line) = grep(/^Content-Type:/o,@header);
            print "#################Content-typegrepline-> $line\n";
                 for my $badcontentre (@badcontentre) {
                      if ($line =~ /$badcontentre/) { print "Bad Conte
+nt->$badcontentre\n"; push(@del_mess,$msg_id); $messcheck .= "$msg_id
+ -> $badcontentre ->$from\n"; $flag = 'bad'; goto DOBODY}
                   }

#check Subject       
            ($line) = grep(/^Subject:/o,@header);
            print "#################Subjectgrepline-> $line\n";
                 for my $badwordre (@badwordre) {
                      if ($line =~ /$badwordre/) { print "badword in s
+ubject->$badwordre\n"; push(@del_mess,$msg_id); $messcheck .= "$msg_i
+d -> $badwordre ->$from\n"; $flag = 'bad'; goto DOBODY}
                  }

#check for base64                    
        if(($line) = grep(/^Content-Transfer-Encoding: base64/o,@heade
+r)){
            print "#################base64grepline-> $line\n";
            if ($line =~ /^Content-Transfer-Encoding: base64/o){print 
+"base64 content\n"; push(@del_mess,$msg_id); $messcheck .= "$msg_id -
+> base64 ->$from\n"; $flag = 'bad';goto DOBODY}
         }

#check Message-Id 
            ($line) = grep(/^Message-[Ii][Dd]:/o,@header);
            print "#################Message-Idgrepline-> $line\n";
            if (($line !~ /.*<?.+@.+(\..+)?>?$/o) || ($line =~ /\@127\
+.0\.0\.1/o))
             { print "Bad Message-Id\n"; push(@del_mess,$msg_id); $mes
+scheck .= "$msg_id -> bad Message-Id ->$from\n"; $flag = 'bad'; goto 
+DOBODY}

DOBODY:
if ($flag eq 'ok'){print "Message $msg_id is OK\n"; line(); next}
if ($flag eq 'bad'){print "Message $msg_id is bad\n"; delmessage($msg_
+id); line(); next}
if ($size > $maxsize){print "Size limit exceeded\n"; push(@del_mess,$m
+sg_id); $messcheck .= "$msg_id -> size limit exceeded ->$from\n"; $fl
+ag = 'bad'; goto DOBODY}
#do body check 
print "Doing body check\n";
    for my $line (@top){
         for my $badwordre (@badwordre) {
         if ($line =~ /$badwordre/) { print "badword in body->$badword
+re\n"; push(@del_mess,$msg_id); $messcheck .= "$msg_id -> bad word in
+ body ->$badwordre ->$from\n"; delmessage($msg_id); line(); goto FINI
+SH}
       }
    }
FINISH:
line();
print "\n";
}
#confirm the deletions 
line();line();line();line();
unless(defined $del_mess[0]){print "No messages to be deleted\nHit Ent
+er to quit\n"; my $input = <>; $pop3->quit(); exit}
print "$messcheck\n\n";
print "Delete above messages?\n  [yY] then Enter to delete, or Enter t
+o abort\n";
chomp(my $input = <>);
if($input =~ /^[yY]$/i){foreach(@del_mess){print "Deleting message $_\
+n"; $pop3->delete($_)}}

$pop3->quit();

sub delmessage{print "Message @_ to be deleted\n"}
sub line {print "####################################################\
+n"}
exit 0;
__END__

SAMPLE FILES:
###badwords######
jackpot
fee
visit
information
confident
confidential
urgent
action
please
notice
cheaper
virus
travel
nigeria
gabon
africa
afrique
\$\$\$
\*\*
\*
PROFITS
debt
INVEST
MONEY
CREDIT
FREE
CASH
BUY
\. \. \.
\.\.\.
party
penis
sex
viagra
prescription
###########################################

####contentbad#############################
text\/html
multipart\/alternative
multipart\/mixed
###########################################

######frombad##############################
.*\.ac>?$
.*\.ae>?$
.*\.af>?$
.*\.ag>?$
.*\.ai>?$
.*\.al>?$
.*\.am>?$
.*\.an>?$
.*\.ao>?$
.*\.aq>?$
.*\.ar>?$
.*\.as>?$
.*\.at>?$
.*\.by>?$
.*\.bz>?$
.*\.ca>?$
.*\.cc>?$
.*\.cd>?$
.*\.cf>?$
.*\.cg>?$
.*\.ch>?$
.*\.ci>?$
.*\.ck>?$
.*\.cl>?$
.*\.cm>?$
.*\.cn>?$
.*\.co>?$
.*\.cr>?$
.*\.cu>?$
.*\.cv>?$
.*\.cx>?$
#############################################

####fromgood#################################
mailgeek\.compgeeks\.com
zmuato
sbiblert
mcubase
mymother
myrelatives
##############################################

#########maillist###########################
suse-linux-e
#############################################

#################receivedgood#################
list2\.suse\.com
#############################################

#########togood##############################
sdlperl@sdlperl\.org
mc@gnome\.org
##############################################