chargrill has asked for the wisdom of the Perl Monks concerning the following question:
Greetings:
I've done some searching, but the search keywords I've thought of thus far haven't really given me much insight. I haven't written any code, nor am I looking for anyone to do so - instead, I'm asking for advice on an approach.
The situation:
What I'm looking for is an idea of how to sift through what's likely to be about 300,000 email messages at a time (roughly a GB or two of text data), find the messages that were originally intended to go to a given set of 15,000 email addresses, figure out what kind of bounce message each one is, and handle it appropriately. Or figure out why we're not handling it appropriately. Handling it appropriately means removing the address from our database, marking it as a "soft bounce" (mailbox is full, etc.), ignoring it, adding a "does not accept HTML email" flag to our database, and so on. But that part is not what I need help with :-)
So I'm looking for a memory-efficient algorithm/approach (hopefully not too slow, though I can live with somewhat slow) for handling this. I honestly can't think of a good way to do it other than writing a thin wrapper around 15k greps, or some other resource-intensive brute-force method.
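For illustration only, here is a rough sketch of one possible alternative to the 15k greps: load the addresses into a hash once, then scan each message a single time and look up every address-like string it contains. This is not code I'm running. The names `build_lookup` and `find_recipients` are made up for the example, the regex is a deliberately loose approximation of an email address, and the sample addresses are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the address list into a hash so that each
# lookup is a constant-time hash probe instead of a separate grep.
sub build_lookup {
    my @addresses = @_;
    my %wanted;
    $wanted{ lc $_ } = 1 for @addresses;
    return \%wanted;
}

# Hypothetical helper: return the wanted addresses mentioned anywhere in
# one message. The pattern is a loose approximation, not a full parser.
sub find_recipients {
    my ($wanted, $text) = @_;
    my %seen;
    while ($text =~ /([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)/g) {
        my $addr = lc $1;
        $seen{$addr} = 1 if $wanted->{$addr};
    }
    return sort keys %seen;
}

# Placeholder data: two "known" addresses and one fake bounce message.
my $wanted = build_lookup('alice@example.com', 'bob@example.org');

my $bounce = <<'END_MESSAGE';
From: MAILER-DAEMON@mx.example.net
Subject: Undelivered Mail Returned to Sender

<Alice@Example.com>: mailbox is full
END_MESSAGE

print "$_\n" for find_recipients($wanted, $bounce);
```

With something like this, memory would be dominated by the 15k-entry hash plus whichever single message is being examined, assuming the messages are read one at a time rather than slurped all at once. Classifying the bounce type would still be a separate step.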
Any and all input welcomed!
s**lil*; $*=join'',sort split q**; s;.*;grr; &&s+(.(.)).+$2$1+; $; = qq-$_-;s,.*,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$*,$/)