Hi monks.
I was wondering if you could help me today. I'm trying to create an email parser that reads an email and breaks it apart into sets of usable chunks. Setting the email portion aside for now, I'm trying to get the regex and data-splitting logic down first.
The email will look something like this:
[title] blah blah blah [title]
[tags] tag tag blah blah [tags]
[message]
blah blah blah
red riding hood
runs away from the scary wolf
blah
blah
[message]
#############################################
[title] another one [title]
[tags] more tags [tags]
[message]
another
message
here
[message]
#############################################
[title] last one [title]
[tags] last one [tags]
[message]
more fun here
[message]
I'm using the row of # characters as the separator, and so far I can successfully split the one email into my usable pieces.
my @split_msg = split(/#############################################/, $message);
That part is good. Now I'm having more problems breaking each thing down to TITLE, TAG and MESSAGE.
foreach my $email (@split_msg)
{
my ($title, $tags, $msg);
$email =~ m/\[title\](.+)\[title\]/i;
$title = $1;
$email =~ m/\[message\](.*)\[message\]/i;
$msg = $2;
print "$title\n\n$msg\n\n";
}
The above code keeps warning that $msg is uninitialized. The pattern was originally (.+), but I tried (.*) to see if that would make any difference. Each portion (tags, title, message) can contain newlines, and I have to capture whatever is in between each pair of markers.
Can you help me figure this one out?
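For what it's worth, two things in the loop above look like the likely cause. The message pattern has only one capture group, so its contents land in $1 rather than $2. And without the /s modifier, . does not match a newline, so a multi-line message never matches at all. Below is a minimal sketch under those assumptions, not a definitive implementation: parse_email is a hypothetical helper name, the sample text is made up for illustration, and the separator is assumed to be any line consisting only of # characters.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split one email body
# into records and pull the title, tags, and message out of each record.
sub parse_email {
    my ($message) = @_;
    my @records;

    # Assumption: a separator is any line made up only of '#' characters.
    for my $chunk (split /^#+\s*$/m, $message) {
        # /s lets '.' match newlines; '.*?' stops at the first closing marker.
        # In list context a match returns its capture groups, so a failed
        # match leaves the variable undefined instead of reusing a stale $1.
        my ($title) = $chunk =~ m/\[title\](.*?)\[title\]/is;
        my ($tags)  = $chunk =~ m/\[tags\](.*?)\[tags\]/is;
        my ($msg)   = $chunk =~ m/\[message\](.*?)\[message\]/is;

        # Skip chunks that lack a title or a message.
        next unless defined $title && defined $msg;

        # Trim leading and trailing whitespace from each captured field.
        for ($title, $tags, $msg) {
            next unless defined;
            s/^\s+//;
            s/\s+$//;
        }

        push @records, { title => $title, tags => $tags, message => $msg };
    }
    return @records;
}

# Made-up sample input in the same layout as the email above.
my $sample = join "\n",
    '[title] first one [title]',
    '[tags] red blue [tags]',
    '[message]',
    'line one',
    'line two',
    '[message]',
    '##########',
    '[title] last one [title]',
    '[tags] last [tags]',
    '[message]',
    'more fun here',
    '[message]',
    '';

for my $rec (parse_email($sample)) {
    print "$rec->{title}\n\n$rec->{message}\n\n";
}
```

Capturing in list context also avoids a subtle trap in the original loop: when a match fails, $1 keeps its value from the previous successful match, so the wrong text can end up in a variable without any warning.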