Hi,
Speaking from personal experience: if I were you, I'd avoid regexes for this problem entirely and use an HTML extraction module instead. It makes things a lot easier, and you don't need to worry about multi-line elements.
Try looking up HTML::Parser, HTML::TokeParser, HTML::TagFilter, HTML::TokeParser::Simple or something similar.
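For what it's worth, here's a rough sketch of the kind of thing I mean, using HTML::TokeParser. I don't know what your actual markup looks like, so the sample HTML and the `paragraphs` helper are made up for illustration; swap in whatever tag you're really after. The point is just that the parser doesn't care whether an element spans several lines.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample input: the first <p> spans several lines.
my $html = <<'HTML';
<p>First
paragraph spanning
several lines</p>
<p>Second one</p>
HTML

# Illustrative helper (not part of any module): returns the text
# of every <p> element in the given HTML string.
sub paragraphs {
    my ($doc) = @_;

    # Passing a reference to a string parses that string directly.
    my $parser = HTML::TokeParser->new(\$doc)
        or die "Could not create parser";

    my @found;

    # get_tag('p') skips ahead to the next <p> start tag and
    # returns undef once there are none left.
    while ($parser->get_tag('p')) {
        # get_trimmed_text reads up to the closing tag, collapsing
        # runs of whitespace (newlines included) into single spaces.
        push @found, $parser->get_trimmed_text('/p');
    }
    return @found;
}

print "$_\n" for paragraphs($html);
```

HTML::TokeParser::Simple wraps the same idea in a friendlier interface, if you'd rather not deal with the raw token arrays.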
HTH,
Neil
In reply to Re: Multi-Line Regex's by Nemp, in thread Multi-Line Regex's by sch