No need to wonder anymore: yes, HTML::Parser will help you accomplish what you're trying to do.
DESCRIPTION
    Objects of the "HTML::Parser" class will recognize markup and separate
    it from plain text (alias data content) in HTML documents. As different
    kinds of markup and text are recognized, the corresponding event
    handlers are invoked.

    "HTML::Parser" is not a generic SGML parser. We have tried to make it
    able to deal with the HTML that is actually "out there", and it normally
    parses as closely as possible to the way the popular web browsers do it
    instead of strictly following one of the many HTML specifications from
    W3C. Where there is disagreement there is often an option that you can
    enable to get the official behaviour.

    The document to be parsed may be supplied in arbitrary chunks. This
    makes on-the-fly parsing as documents are received from the network
    possible.

    If event driven parsing does not feel right for your application, you
    might want to use "HTML::PullParser". It is a "HTML::Parser" subclass
    that allows a more conventional program structure.
If you have no idea how I got that description, please read this friendly guide on perl documentation and resources.
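
To make the event-driven model from that description concrete, here is a minimal sketch rather than a finished solution. The sample markup and the chunk boundaries are made up for illustration. It registers start, end and text handlers, then feeds the document in arbitrary pieces, deliberately splitting a tag across two chunks. I turn on unbroken_text so that a run of text is not reported in fragments just because a chunk ended in the middle of it.

```perl
use strict;
use warnings;
use HTML::Parser;

my @events;

# api_version => 3 selects the handler/argspec interface.
my $p = HTML::Parser->new(
    api_version => 3,
    start_h => [ sub { push @events, "start:$_[0]" }, 'tagname' ],
    end_h   => [ sub { push @events, "end:$_[0]" },   'tagname' ],
    text_h  => [ sub { push @events, "text:$_[0]" },  'dtext'   ],
);

# Report each run of text as one event, even when it spans chunks.
$p->unbroken_text(1);

# Hypothetical chunks, as they might arrive from a socket.
# Note that the closing </b> tag is split across two of them.
$p->parse($_) for '<p>Hello, <b>wor', 'ld</', 'b>!</p>';
$p->eof;    # signal end of document so buffered text is flushed

print "$_\n" for @events;
```

Calling eof matters: without it the parser may still be holding text it has not reported yet.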

There is a better way, and it's called HTML::TokeParser (see Tutorials for a tutorial).
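
For comparison, a small sketch of the HTML::TokeParser style, assuming the task is something like pulling links out of a page (the original question may be after something else, and the sample markup is invented). Instead of registering callbacks, you ask the parser for the next tag you care about and read the text up to its end tag.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Made-up document for illustration only.
my $html = '<html><body><p>See <a href="http://example.com/a">first</a> '
         . 'and <a href="/b">second</a>.</p></body></html>';

# A reference to a scalar is parsed as the literal document, not a filename.
my $p = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

my @links;
while (my $tag = $p->get_tag('a')) {
    # A start tag token is [$tagname, \%attr, \@attrseq, $text].
    my $href = $tag->[1]{href};
    next unless defined $href;

    # Collect the link text up to the closing </a>, whitespace trimmed.
    my $text = $p->get_trimmed_text('/a');
    push @links, [ $href, $text ];
}

print "$_->[0]\t$_->[1]\n" for @links;
```

The loop reads top to bottom like ordinary code, which is the main reason I find it easier than handlers for this kind of extraction.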

____________________________________________________
** The Third rule of perl club is a statement of fact: pod is sexy.


In reply to Re: HTML::Parser?? by PodMaster
in thread HTML::Parser?? by bleekbob
