If you are using SpamAssassin, the two example plugins below should help:

#
# MultiReceived
# Written By: Roy Elton Crowder, III (roy.crowder@gmail.com)
# Date: 12 March 2009
# Company: WorldSpice Technologies
#                   5050 Poplar Avenue, Suite 170
#                   Memphis, TN 38111
#                   Tollfree Number: (866) 466-7733
# Description:
#   This script was written to count the number of "Received: from" in a
#   header. We were getting emails that were bouncing from multiple email
#   servers before they came to us. Spammers found that they could get
#   through this way. If more than one "Received: from" are found then
#   the script returns 1 (true) which tells SpamAssassin to assign points
#   to the email in question.
#
# If you have any questions please feel free to email me at the email above.
#


package MultiReceived;
1;

use strict;

# Module imports
use Mail::SpamAssassin;
use Mail::SpamAssassin::Plugin;

# Inheritance
our @ISA = qw(Mail::SpamAssassin::Plugin);

# Subroutine new
sub new {
    my ($class, $mailsa) = @_;

    # Create the object
    $class = ref($class) || $class;
    my $self = $class->SUPER::new( $mailsa );
    bless ($self, $class);

    # Register the object's subroutine with SpamAssassin as a Plugin
    $self->register_eval_rule ( 'check_for_multiple_received' );

    return $self;
}


#
# check_for_multiple_received
# Parameters:
# $self
# $msg
#
sub check_for_multiple_received {
    # $msg is an object from Mail::SpamAssassin::PerMsgStatus
    my ($self, $msg) = @_;

    # Get the entire header.
    my $header = $msg->get( 'ALL' );

    # Split the header on new lines.
    my @h = split(/\n/, $header);

    # Counting Variable
    my $num_received = 0;

    # Count the number of "Received: from" lines there are.
    foreach (@h) {
        # Regex to match against each line of the header. It is
        # anchored to the start of the line so that other headers
        # (for example "X-Received:") are not counted. If a
        # "Received: from" is found, add 1 to the count.
        if ($_ =~ /^Received:\s*from\b/i) {
            $num_received = $num_received + 1;
        }

        # If more than 1 "Received: from" is found,
        # do not continue, return 1 (true) to assign
        # points.
        if ($num_received > 1) {
            return 1;
        }
    }

    # If we made it this far, the email was good
    return 0;
}

#
# MultiNewLine
# Written By: Roy Elton Crowder, III (roy@worldspice.net)
# Date Written: 24 March 2009
# Company: WorldSpice Technologies
#                   5050 Poplar Avenue, Suite 170
#                   Memphis, TN 38111
#                   Tollfree Number: (866) 466-7733
#
# Description:
#   This SpamAssassin plugin is written to handle emails that have a
#   continuous set of \n (newline) characters. We have set this plugin to
#   catch any email that has 10 or more continuous \n characters. It is
#   pretty common to see 2-3 \n characters towards the end of an email for
#   signature purposes but anything beyond 10 is considered spam.
#

package MultiNewLine;
1;

use strict;

use Mail::SpamAssassin;
use Mail::SpamAssassin::Message;
use Mail::SpamAssassin::Plugin;
our @ISA = qw(Mail::SpamAssassin::Plugin);

# new is used to instantiate a new SpamAssassin plugin
sub new {
   my ($class, $mailsa) = @_;
   $class = ref($class) || $class;
   my $self = $class->SUPER::new( $mailsa );
   bless ($self, $class);
   $self->register_eval_rule ( 'check_for_multiple_newline' );

   return $self;
}

#
# check_for_multiple_newline
# Parameters:
# $self
# $msg
#
sub check_for_multiple_newline {
    my ($self, $msg) = @_;

    # The $msg variable is passed in as a PerMsgStatus object.
    # To get the body of the email we must first get a
    # Mail::SpamAssassin::Message object. This is done by using the
    # get_message() subroutine defined under the PerMsgStatus object.
    my $message = $msg->get_message();

    # Now that we have an actual message object, we can get the body.
    my $body = $message->get_body();

    # Variables
    my $nl_count = 0;
    my $found = 0;

    # We now can parse through the body, line by line, and count the
    # number of \n characters. We want a continuous set of \n characters,
    # thus you see some additional checks within the loop.
    foreach (@$body) {
        # We mark $found as true when a \n character is found on a line
        # by itself. If we have already found a \n on a line by itself then
        # each subsequent \n character we find on a line by itself will up
        # the count by one. If we come across a line that has something
        # other than a \n after a \n has been found on a line by itself
        # then we set the count back to zero and found is false.
        if ($found) {
            if ($_ =~ /^\n$/) {
                $nl_count = $nl_count + 1;
            } else {
                $nl_count = 0;
                $found = 0;
            }
        } else {
            if ($_ =~ /^\n$/) {
                $nl_count = 1;
                $found = 1;
            }
        }

        # If our count is greater than or equal to 10, add points.
        # Otherwise, continue parsing the message.
        if ($nl_count >= 10) {
            return 1;
        }
    }

    # If we have gotten this far, the message is legit as
    # far as this module is concerned, don't add points.
    return 0;
}
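
Both plugins only register eval rules, so they do nothing until a rule in the SpamAssassin configuration calls them. A minimal sketch of that wiring might look like the following. The file paths, the rule names MULTI_RECEIVED and MULTI_NEWLINE, the descriptions, and the scores are all assumptions for illustration and not taken from the original setup. Using a header-type rule for the first plugin and a body-type rule for the second is also my guess at a sensible fit.

```
# local.cf (sketch only; paths, rule names, and scores are assumptions)
loadplugin MultiReceived /etc/mail/spamassassin/MultiReceived.pm
loadplugin MultiNewLine  /etc/mail/spamassassin/MultiNewLine.pm

header   MULTI_RECEIVED  eval:check_for_multiple_received()
describe MULTI_RECEIVED  More than one "Received: from" header line
score    MULTI_RECEIVED  1.0

body     MULTI_NEWLINE   eval:check_for_multiple_newline()
describe MULTI_NEWLINE   Ten or more consecutive blank lines in the body
score    MULTI_NEWLINE   1.0
```

Running spamassassin --lint after editing the configuration should report a plugin that fails to load or a rule that does not parse.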

You can find complete explanations on these here.
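
If you want to try the two checks without a running SpamAssassin, the sketch below repeats the same counting ideas on hand-made data. The helper names count_received_from and longest_blank_run, the sample header, and the sample body are all made up for illustration and are not part of the plugins. The header pattern here is anchored to the start of the line, which is stricter than an unanchored match.

```perl
use strict;
use warnings;

# Hypothetical helper: count header lines that begin with "Received: from".
sub count_received_from {
    my ($header) = @_;
    my $count = 0;
    for my $line (split /\n/, $header) {
        $count++ if $line =~ /^Received:\s*from\b/i;
    }
    return $count;
}

# Hypothetical helper: length of the longest run of lines that hold
# nothing but a newline, given an array ref with one line per element.
sub longest_blank_run {
    my ($lines) = @_;
    my ($run, $longest) = (0, 0);
    for my $line (@$lines) {
        if ($line =~ /^\n$/) {
            $run++;
            $longest = $run if $run > $longest;
        } else {
            $run = 0;
        }
    }
    return $longest;
}

# Made-up sample data using documentation-only hosts and addresses.
my $header = join "\n",
    'Received: from mx1.example.org (mx1.example.org [192.0.2.10])',
    "\tby mail.example.com with ESMTP; Thu, 12 Mar 2009 10:00:00 -0500",
    'Received: from relay.example.net (relay.example.net [198.51.100.7])',
    "\tby mx1.example.org with ESMTP; Thu, 12 Mar 2009 09:59:58 -0500",
    'Subject: test',
    '';

my @body = ("Hello\n", ("\n") x 10, "Bye\n");

printf "Received: from lines: %d\n", count_received_from($header);
printf "Longest blank run: %d\n", longest_blank_run(\@body);
```

With this sample data the first check would fire (more than one "Received: from" line) and so would the second (a run of 10 blank lines).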

The Web is like a dominatrix. Everywhere I turn, I see little buttons ordering me to Submit. (Nytwind)

In reply to Re: Extract IP from email dataset? by RoyCrowder
in thread Extract IP from email dataset? by bharadwajh
