dgaramond2 has asked for the wisdom of the Perl Monks concerning the following question:

I have several production servers running a home-grown SMTP daemon written in Perl. It is actually an SMTP proxy for qmail-smtpd, forks per client, and uses Net::SMTP::Server (which uses IO::Socket). It is about 1,500 lines long.

Lately I've been getting complaints of duplicate mails received _only_ from some ISPs. The cause is that, after the sender sends ".", my program replies with a "250 QUEUED" line but the sender doesn't seem to receive it. The SMTP connection then hangs for a while before being disconnected. The sender then retries the mail delivery several times. The result: users receive duplicates of their mails (sometimes only 5, sometimes up to 15-20!).

The senders seem to be Windows machines (they use either Communigate Pro or MDaemon). I had a client try this on a Win98 machine and an NT4 machine with the telnet program, and sure enough, after he sends ".", my program replies with a 250 status line (which is shown in the debug log) but he doesn't receive it. He sent several more lines, and my program replied as it should, but again no further response was received at his end.

I haven't been able to reproduce this in our office, using Win2K, Win98 SE, as well as Linux. The machines are behind a Linux firewall.

There seems to be no problem with other clients, and qmail-smtpd on its own doesn't have this problem.

Any hints?

Replies are listed 'Best First'.
Re: Socket weirdness
by dave_the_m (Monsignor) on May 07, 2004 at 18:02 UTC
    I'm not familiar with Net::SMTP::Server, but if it's possible, make sure, before the child process exits, that the socket is closed, and that the return value of close() is checked for success. Otherwise there's no guarantee that the sender has received and acknowledged the final reply packet.
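
The advice above can be sketched roughly as follows. This is a minimal illustration, not code from the daemon in question: `send_final_reply` is a hypothetical helper name, and a local socketpair stands in for the real client connection. Note that a successful close() does not by itself prove the peer read the data, but a failed print, flush, or close is a clear sign that something went wrong.

```perl
use strict;
use warnings;
use IO::Handle;
use Socket;

# Hypothetical helper: send the last reply, then shut down and close the
# socket, returning an error string if any step fails (undef on success).
sub send_final_reply {
    my ($sock, $reply) = @_;
    print {$sock} "$reply\015\012" or return "print failed: $!";
    $sock->flush                   or return "flush failed: $!";
    # How = 1 closes the write side only, so the peer sees a clean EOF.
    shutdown($sock, 1)             or return "shutdown failed: $!";
    close($sock)                   or return "close failed: $!";
    return;
}

# Demo: a socketpair stands in for the real SMTP connection.
socketpair(my $server, my $client, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
my $err = send_final_reply($server, '250 QUEUED');
print defined $err ? "error: $err\n" : "reply sent and socket closed\n";
```

In a forking daemon, the error string would typically go to the debug log before the child exits.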
Re: Socket weirdness
by sgifford (Prior) on May 07, 2004 at 18:46 UTC

    If you're already using qmail-smtpd and friends, set up recordio to record all input and output to your mail server. Then you should be able to identify problem sessions, and see what went on.

    Alternatively, use a packet sniffer like tcpdump to monitor all traffic with the hosts that are having problems, then review the logs after a problem happens.

    A few random thoughts... One problem that seems to particularly plague Windows mail clients is not sending a proper CR/LF pair at the end of each SMTP line. Another possibility is that you're not flushing your socket's output buffer and don't have AutoFlush turned on for it. I had a problem with an SMTP wrapper for qmail-smtpd that turned out to be an improper PIPELINING implementation on my end. All of those are easy to test for, so make sure that your system handles them properly.

    If none of that helps, well, post some code. :)

      Thanks for the ideas. I haven't resorted to tcpdump, but I do log all lines received from the clients and all lines sent from the program.

      The IO::Socket docs say autoflush is turned on by default since 1.18. I've just checked and it's indeed on.

      I don't implement EHLO & PIPELINING (yet).

      The core of the problem is that my program has correctly received all the DATA as well as the ".", and has processed the transaction and sent the 250 line, but these clients don't seem to get it. I haven't checked the return value of $sock->close() as the other poster noted; will do that. I'll also try doing a manual $sock->flush().

        And you're sending "250 ...\r\n", right? Not sending a proper SMTP line ending could confuse some clients and not others.

        Also, is it possible the clients are timing out, because you're taking a very long time to process the message?

        I find that using something like snoop/tcpdump or truss/strace is a great way of getting a reality check on what my program is really doing, as compared to what I think it's doing. :)
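
One way to rule out the line-ending question raised above is to build every reply through a single function that always appends CRLF. The sketch below assumes a hypothetical helper named `smtp_reply`; it is not part of Net::SMTP::Server. It writes the terminator as `\015\012` rather than `\r\n`, since the octal form means CR LF on every platform.

```perl
use strict;
use warnings;

# Hypothetical helper: format one SMTP reply line that always ends in CRLF.
sub smtp_reply {
    my ($code, $text) = @_;
    $text =~ s/[\015\012]+\z//;    # drop any line ending the caller added
    return sprintf "%03d %s\015\012", $code, $text;
}

# Whether the caller passes a bare string or one ending in "\n",
# the wire format is the same.
my $line = smtp_reply(250, "QUEUED\n");
print $line eq "250 QUEUED\015\012" ? "CRLF terminated\n" : "bad line ending\n";
```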

Re: Socket weirdness
by traveler (Parson) on May 07, 2004 at 17:34 UTC
    Any chance the rogue clients are behind outbound mail proxies (e.g. spam or virus filters)? It could be that those outbound filters are the issue.

    --traveler

      After some more testing, this seems to be the problem.

      If I print "250 QUEUED - 123"

      or "250 QUEUED(250) - 1084263071 qp 4221"

      then everything works, but if I send:

      "250 QUEUED(250) - 8c3d931926ca4e8a9dfea84f06dbdc1a"

      then the client won't receive the above 250 response line. The 32-hex-character part is a GUID which I produce randomly using 16 bytes read each time from /dev/urandom. Now why would a proxy or a firewall regard a random 32-character hexadecimal string as suspicious and block/filter it?

        There may be a length limit. RFC2822 says the limit should be 998 chars, but I dunno about the actual software. Of course, 2822 does not really apply. The issue is 821. Check RFC821 appendix E for a discussion of replies...
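
To test the length idea, the failing reply can be rebuilt and measured. This is a sketch under assumptions: `random_hex_id` and `reply_fits` are hypothetical names, and the 512-character figure is the reply-line limit (CRLF included) from RFC 821 section 4.5.3.

```perl
use strict;
use warnings;

# Reply-line limit from RFC 821 section 4.5.3, CRLF included.
use constant MAX_REPLY_LINE => 512;

# Read 16 random bytes and render them as 32 lowercase hex characters,
# as described in the post above.
sub random_hex_id {
    open my $fh, '<:raw', '/dev/urandom' or die "open /dev/urandom: $!";
    my $n = read $fh, my $bytes, 16;
    close $fh;
    die "short read from /dev/urandom" unless defined $n && $n == 16;
    return unpack 'H*', $bytes;
}

# True if the full reply line, terminator included, is within the limit.
sub reply_fits {
    my ($line) = @_;
    return length($line) <= MAX_REPLY_LINE;
}

my $reply = '250 QUEUED(250) - ' . random_hex_id() . "\015\012";
printf "%d bytes, fits: %s\n", length($reply), reply_fits($reply) ? 'yes' : 'no';
```

The reply in question comes to 52 bytes with its CRLF, well under that limit, so if length matters here it would have to be a much stricter limit imposed by the intermediate software rather than by the RFC.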
Re: Socket weirdness
by dgaramond2 (Monk) on May 07, 2004 at 17:26 UTC
    Forgot to add: this is Perl 5.8.1 running on Red Hat 7.3 with the latest kernel from fedoralegacy.org. Net::SMTP::Server 1.1, IO::Socket::INET 1.27, IO::Socket 1.28, IO::Handle 1.23.