Ovid has asked for the wisdom of the Perl Monks concerning the following question:

One task I've been assigned at my new job is to find a reliable way of stopping spammers from exploiting an issue with Email-Simple. Basically, the email looks like this:

To: GetADiplomaOnline
Content-Type: multipart/alternative; boundary=be638aa04b654852d0173c0e3f9b6d20
From: some@email.address.co.uk
to: @huge_list_of_email_addresses

Because we have both a To: and a to: header (note case), Email::Simple reports the To: header ("GetADiplomaOnline") when doing this:

my $email   = Email::Simple->new($email_text);
my @headers = $email->header('to');

If the case of the headers were the same, then I'd get both headers. Because I don't get both headers, our validation checks ignore the second header. Peeking inside the email object reveals this:

'header_names' => {
    'content-type' => 'Content-Type',
    'to'           => 'To',
    'from'         => 'From',
    'subject'      => 'Subject'
},
'order' => [ 'To', 'Content-Type', 'From', 'to', 'Subject' ]

Further research reveals that this module parses both headers, though it doesn't report both in this case. RFC 2822 says we can't have more than one header for to: (assuming I read it correctly), so I'm guessing that one way to stop this spam attack is to disallow email which has more than one to:, cc: or bcc: header.

Is this a reasonable approach? If so, how should I go about this? I don't want to reach into Email::Simple's internals to test this, but I have so little experience in this area that I'm not sure what best practices are.

Note: Email::Simple::Headers does not report the extra to: header, so that is not an option.
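
For illustration, here is a minimal sketch of the check described above that works on the raw message text and so never touches Email::Simple's internals. The sub names count_recipient_headers and has_duplicate_recipient_headers are hypothetical, and the sketch assumes the header block ends at the first empty line and that folded continuation lines begin with whitespace. It has not been run against real spam samples.

```perl
use strict;
use warnings;

# Hypothetical helper: count To/Cc/Bcc header fields in raw message text,
# comparing field names case-insensitively.
sub count_recipient_headers {
    my ($raw) = @_;

    # Assumption: the header block ends at the first empty line.
    my ($head) = split /\r?\n\r?\n/, $raw, 2;
    $head = '' unless defined $head;

    my %count = ( to => 0, cc => 0, bcc => 0 );
    for my $line ( split /\r?\n/, $head ) {
        # Folded continuation lines start with whitespace, so they never
        # match this pattern and are not counted as new fields.
        next unless $line =~ /^(to|cc|bcc)[ \t]*:/i;
        $count{ lc $1 }++;
    }
    return \%count;
}

# Hypothetical helper: returns the number of recipient header names that
# occur more than once (0 means the message passes this check).
sub has_duplicate_recipient_headers {
    my ($raw) = @_;
    my $count = count_recipient_headers($raw);
    return scalar grep { $_ > 1 } values %$count;
}
```

A validation step could then reject any message for which has_duplicate_recipient_headers($email_text) is true, before the text is handed to Email::Simple->new.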

Cheers,
Ovid


Replies are listed 'Best First'.
Re: Spammers exploiting Email::Simple
by xdg (Monsignor) on Jun 19, 2006 at 13:05 UTC

    Out of curiosity, have you tried seeing what Mail::Message makes of it? (Not suggesting that your $job would let you switch, just curious about the side-by-side comparison on the same, broken email.)

    My frequent worry about Email::Simple is that it might be too simple in how it deals with things that aren't standards compliant. I've had good luck with Mail::Message in the past -- I've accepted its heavier weight and learning curve for the robustness I've seen on broken emails.

    The following code should be a quick test if you've got it installed. (Read STDIN, print addresses to STDOUT).

    use strict;
    use warnings;
    use Mail::Message;

    local $\ = "\n";
    my $msg = Mail::Message->read(\*STDIN);
    print $_->format for $msg->to;

    Update: I went ahead and quickly tried it on a sample email -- adding an extra "to:" header after all the other headers. The sample code above gave both the "To:" and "to:" addresses.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: Spammers exploiting Email::Simple
by bart (Canon) on Jun 19, 2006 at 13:20 UTC
    AFAIK, "To" and "to" headers should be equivalent. The common thing to do, in my experience, is to convert every header name to title case, for example "Content-Type", which is equivalent to "content-type", "CONTENT-TYPE", and so on.

    In other words: I think it's a bug in Email::Simple, as it doesn't seem to convert the headers to a standard form. I do see a line

    header_names => { map { lc $_ => $_ } keys %$head_hash }
    in the sub new, but that doesn't change the contents of the hash %$head_hash, which is used verbatim.
    head => $head_hash,
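
    To make the idea of a standard form concrete, here is a minimal sketch that groups header values under a case-folded field name, so that "To" and "to" end up in the same list. This is not Email::Simple's code or a patch for it. The sub name headers_by_folded_name is hypothetical, lowercase is used as the canonical form instead of title case purely for simplicity, and folded continuation lines are assumed to start with whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: parse a header block and return a hash ref that maps
# each lowercased field name to an array ref of its values, in order.
sub headers_by_folded_name {
    my ($head_text) = @_;
    my ( %values, $current );

    for my $line ( split /\r?\n/, $head_text ) {
        last if $line eq '';    # an empty line ends the header block

        if ( $line =~ /^([^\s:]+)[ \t]*:[ \t]*(.*)$/ ) {
            # New field: fold the name so "To" and "to" share one key.
            $current = lc $1;
            push @{ $values{$current} }, $2;
        }
        elsif ( defined $current && $line =~ /^[ \t]+(.*)$/ ) {
            # Continuation line: append to the most recent value.
            $values{$current}[-1] .= " $1";
        }
    }
    return \%values;
}
```

    With the headers from the original post, the list stored under 'to' would hold both values, which is what a validation check needs to see.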
Re: Spammers exploiting Email::Simple
by jdtoronto (Prior) on Jun 19, 2006 at 18:30 UTC
    Some time back we coded a module using Email::Simple and some of its companions. As someone else has suggested, we became suspicious of its simplicity. Subsequently we replaced it with another module written using Mail::Message, Mail::Header and Mail::Address with MIME::Parser.

    Although we would value the light weight of the newer modules, I am not convinced that they are yet up to "industrial standard".

    jdtoronto