originalkilroy has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl
open(Infile,$ARGV[0]);
while(<Infile>){
    #Rule 1 test for love in subject of spam.
    if (/^Subject:.*(love).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(pharmacy).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(adobe).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(erection).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(sexual).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(penis).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(shag).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(bed).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(rolex).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(Re.*:).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(viagra).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(weight).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(drugs).*\n/i){
        #dosomething
    }
    if (/^Subject:.*(deals).*\n/i){
What are we going to do tonight Brain? What we do every night Pinky take over the world.

Re: I need a simple spam fitler to sort through about 500 files
by Tanktalus (Canon) on Mar 26, 2005 at 17:06 UTC
Re: I need a simple spam fitler to sort through about 500 files
by jhourcle (Prior) on Mar 26, 2005 at 17:16 UTC

    There are enough spam filters, and general-purpose mail filters with spam rulesets, that it'd be pointless to reinvent the wheel.

    Especially because we don't know how you have the messages stored (500 files may be 500 mbox-format mailboxes, or 500 messages, or 500 Outlook mailboxes, etc.)

    spam.abuse.net has a rather long list of mail filters. Many are SMTP based, but you can always just set something up to reprocess the mail.

    If you just need a program for filtering mail, based on whatever rules you're using, you could try Mail::Procmail. I've never used the module, but I've been using the original procmail implementation for years for sorting mailing lists, spam, etc.

    You might also search CPAN for "spam", which pulls up a lot of hits.

      All I need to do is move already-identified spam from an INBOX folder to a SPAM folder, deleting the original and renaming the file in the process. I just need a simple pattern and a set of rules to write to the header of the spam as it is written back to spam.<filename> in the SPAM folder. Thank you, OriginalKilroy
      What are we going to do tonight Brain? What we do every night Pinky take over the world.
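The move-and-tag step described in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions: each file holds exactly one message that begins with its headers, and the header name `X-Spam-Rule` and the sub name `move_to_spam` are made up for the example, since the post does not say what should be written.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical header name; the original post does not specify one.
my $TAG_HEADER = 'X-Spam-Rule';

# Copy one message from $inbox_dir to $spam_dir as "spam.<filename>",
# writing a tag header ahead of the existing headers, then delete the
# original. Assumes one message per file (not an mbox).
sub move_to_spam {
    my ($inbox_dir, $spam_dir, $filename, $rule) = @_;

    my $src = File::Spec->catfile($inbox_dir, $filename);
    my $dst = File::Spec->catfile($spam_dir, "spam.$filename");

    open my $in,  '<', $src or die "Cannot read $src: $!";
    open my $out, '>', $dst or die "Cannot write $dst: $!";

    print {$out} "$TAG_HEADER: $rule\n";   # new header goes first
    print {$out} $_ while <$in>;           # rest of the message unchanged

    close $in;
    close $out or die "Cannot close $dst: $!";
    unlink $src or die "Cannot delete $src: $!";

    return $dst;
}
```

A call such as `move_to_spam('INBOX', 'SPAM', $file, 'rolex')` would then leave `SPAM/spam.$file` behind and remove the original; the copy is closed before the original is deleted, so a failed write does not lose the message.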
Re: I need a simple spam fitler to sort through about 500 files
by cog (Parson) on Mar 26, 2005 at 17:11 UTC

    Always start with:

    use warnings; use strict;

    This will catch half (or more) of the mistakes you're likely to have in any piece of code.

    Next: are you going to do the same thing for every kind of spam? If so, you could use just one single regular expression to catch them all.
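As a rough sketch of the single-expression idea, assuming every keyword triggers the same action: the keywords below are a subset of those in the original post, the `\b` word boundaries are an addition not present in the original rules (they keep a short word from matching inside a longer one), and `spam_keyword` is a hypothetical helper name.

```perl
use strict;
use warnings;

# One pattern covering several of the keywords from the original post.
# The capture group records which keyword fired.
my $SPAM_SUBJECT =
    qr/^Subject:.*?\b(love|pharmacy|adobe|rolex|viagra|weight|drugs|deals)\b/i;

# Return the matched keyword (lowercased), or undef if the line is clean.
sub spam_keyword {
    my ($line) = @_;
    return $line =~ $SPAM_SUBJECT ? lc $1 : undef;
}

# Small demonstration on two made-up subject lines.
for my $line ("Subject: Cheap ROLEX watches\n", "Subject: Meeting notes\n") {
    my $hit     = spam_keyword($line);
    my $verdict = defined $hit ? "spam ($hit)" : "ok";
    print "$verdict: $line";
}
```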

    OTOH, remember that spammers are nowadays including deliberate spelling errors so that this kind of method doesn't work...

Re: I need a simple spam fitler to sort through about 500 files
by Joost (Canon) on Mar 26, 2005 at 19:28 UTC
Re: I need a simple spam fitler to sort through about 500 files
by chas (Priest) on Mar 26, 2005 at 19:29 UTC
    Ditto on what everyone else said about using existing tools. One problem with what you are doing (and I've done such things myself and discovered the error of my ways) is that you'll get subjects containing: v1agra, vi@gra, v_iagra, v i a g r a, v~iagra, etc, etc. It's really tough to match everything of that sort and not produce a lot of false positives; one can come close, but then it becomes a full time job.
    chas
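To make the difficulty concrete, here is a hedged sketch of one way such variants might be matched: each letter is allowed a few look-alike characters and an optional separator may sit between letters. The substitution table and the name `fuzzy_pattern` are assumptions for illustration, and the table is nowhere near complete.

```perl
use strict;
use warnings;

# Illustrative look-alike table (an assumption, far from exhaustive).
my %LOOKALIKE = (
    a => '[a@4]',
    i => '[i1!|]',
    o => '[o0]',
    e => '[e3]',
    s => '[s5\$]',
);

# Build a case-insensitive regex that matches $word with look-alike
# characters and at most one non-alphanumeric separator between letters.
sub fuzzy_pattern {
    my ($word) = @_;
    my @parts = map {
        exists $LOOKALIKE{$_} ? $LOOKALIKE{$_} : quotemeta($_)
    } split //, lc $word;
    my $body = join '[\W_]?', @parts;
    return qr/$body/i;
}

my $re = fuzzy_pattern('viagra');
for my $subject ('v1agra', 'vi@gra', 'v_iagra', 'v i a g r a', 'v~iagra') {
    print "caught: $subject\n" if $subject =~ $re;
}
```

The false-positive problem shows up immediately: `fuzzy_pattern('bed')` also matches the harmless phrase "maybe done", because the space is accepted as a separator between "e" and "d".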
Re: I need a simple spam fitler to sort through about 500 files
by ambs (Pilgrim) on Mar 26, 2005 at 17:04 UTC
    Can't you post your question again with a decent title and a good description? Why don't you put the keywords for viagra, drugs, deals and so on in an array so you can join them all together in a single action?

    Alberto Simões
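A minimal sketch of the array suggestion, assuming all keywords share one action: the list is a subset of the keywords in the original post, and `is_spam_subject` is a hypothetical name standing in for the shared "#dosomething" test.

```perl
use strict;
use warnings;

# Keyword list kept in one place so it is easy to extend.
my @keywords = qw(love pharmacy adobe rolex viagra weight drugs deals);

# Escape each keyword, join them into one alternation, compile once.
my $alternation = join '|', map { quotemeta($_) } @keywords;
my $subject_re  = qr/^Subject:.*(?:$alternation)/i;

# Return 1 if the line is a Subject header containing any keyword, else 0.
sub is_spam_subject {
    my ($line) = @_;
    return $line =~ $subject_re ? 1 : 0;
}

print is_spam_subject("Subject: best DEALS today\n") ? "spam\n" : "ok\n";
```

Adding a rule then means adding one word to `@keywords` rather than another `if` block; `quotemeta` keeps any punctuation in a keyword from being read as regex syntax.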