I'm using Mail::POP3Client to retrieve messages from an Exchange account over an SSL connection. I've run into a few problems when getting the email body using Body():
1. HTML emails contain more than just HTML tags: they include some header info and miscellaneous information that varies depending on which carrier the email is from. Text is even duplicated, and special characters are converted differently. I've tried using regular expressions to clean it up, but it's getting hairy to account for all the discrepancies, and I want to make sure this will work with ANY carrier. I can make all this work, however...
2. Emails originating from Cox (there may be more providers; this is just the one that I've discovered so far) get translated to gibberish. E.g., the original plain-text mail says 'This is a test...', but Body() returns only 'VGhpcyBpcyBhIHRlc3QuLi4NCg==', and the HTML email returns a basic header followed by lines and lines of seemingly random characters. Viewing the email in Outlook looks normal.
The ultimate goal is to take the body of the message as simple plain text and send it as a text message to a cell phone. Any extra or confusing characters are unacceptable. Is there a better module out there that I should be using instead, or is there a way to make this work?
for (my $i=1; $i<=$messages; $i++){
    foreach( $pop->Head($i)){
        if($_=~/From:[^<>]*\<(.*)\>/){
            print "From: $1 ";
            $emails[$i]->{'from'}=$1;
        }
        if($_=~/Subject:(.*)/){
            #print "To: $1 ";
            $emails[$i]->{'subject'}=$1;
        }
    }
    my $body=$pop->Body($i);
    $body=~s/\n/ /g;
    $body=~s/\r/ /g;
    $body=~s/^.*\<body[^<>]*\>(.*)/$1/;
    $body=~s/(.*)\<\/body[^<>]*\>.*$/$1/;
    while($body=~/[<>]/){
        $body=~s/\<[^<>]*\>(.*)/$1/;
    }
    $body=~s/’/\'/g;
    $body=~s/'/\'/g;
    $body=~s/\=92/\'/g;
    $body=~s/\=A0/ /g;
    $body=~s/\=\s //g;
    $body=~s/\ / /g;
    $body=~s/\s{2,}/ /g;
    $emails[$i]->{'body'}=$body;
    #$pop->Delete($i);
    print "Message: ".$body."\n\n";
}
$pop->Close();
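The string in item 2 looks like Base64 content-transfer encoding, and the =92 and =A0 sequences handled in the loop above look like quoted-printable. Here is a minimal sketch of decoding both with the core MIME::Base64 and MIME::QuotedPrint modules. It assumes the encoding name has already been read from the message's Content-Transfer-Encoding header; decode_transfer_encoding is a hypothetical helper name used only for illustration, not part of any module.

```perl
use strict;
use warnings;
use MIME::Base64      qw(decode_base64);
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper (not from any module): decode a raw body according
# to the value of its Content-Transfer-Encoding header.
sub decode_transfer_encoding {
    my ($encoding, $raw) = @_;
    $encoding = lc($encoding // '');

    return decode_base64($raw) if $encoding eq 'base64';
    return decode_qp($raw)     if $encoding eq 'quoted-printable';
    return $raw;    # 7bit, 8bit, binary, or no header: leave untouched
}

# The sample string from item 2, assumed to have been sent as base64.
my $text = decode_transfer_encoding('base64', 'VGhpcyBpcyBhIHRlc3QuLi4NCg==');
$text =~ s/\r?\n\z//;    # drop the trailing CRLF that was part of the payload
print "$text\n";         # prints "This is a test..."
```

This only covers single-part bodies. For multipart or HTML mail, a full MIME parser such as Email::MIME or MIME::Parser would likely be more robust than decoding by hand.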
In reply to Mail::Pop3Client - want to get consistent body text by ksublondie