Important update: I forgot one of the key points about this module. Like CGI::Safe, it's a drop-in replacement: you can use it in place of HTML::TokeParser and the change is completely transparent. Then just use the new methods wherever you want the extra clarity. This makes migrating to the module simple.

Update 2: crazyinsomniac is right: the AUTOLOAD has got to go. When the methods were simple, the AUTOLOAD sort of made sense. Now, however, I need to overload a couple of them, and even pulling those out and refactoring the rest into an AUTOLOAD just doesn't offer enough benefit to justify the obfuscation. Darn it. If anyone else had written this, I would have been the first to point that out. How reluctant we are to admit that our children are ugly :)

After prompting from a couple of monks, I finally got off my duff and finished up the HTML::TokeParser::Easy module. This is basically an adaptor for HTML::TokeParser that makes the module easier to use (no more memorizing array indices). For example, with HTML::TokeParser, if you want to find out whether a token returned from get_token() is the start tag of a form, you would do this:

if ( $token->[0] eq 'S' and $token->[1] eq 'form' ){...}

Now, you just do this:

if ( $token->is_start_tag( 'form' ) ){...}

Is a token a comment?

if ( $token->is_comment ){...}

That was originally $token->[0] eq 'C'.

Need the attributes of a given token?

my $attributes = $token->attr;

That code was $token->[2] for a token returned by get_token(), or $token->[1] for one returned by get_tag(). Now, it's one standard method.
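
To make the adaptor idea concrete, here's a minimal sketch of how such a token class could work. This is not the module's actual source: the package name My::Token is hypothetical, the tokens are built by hand in the array layout that HTML::TokeParser documents for get_token(), and attr only handles that layout (not tokens from get_tag()).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the token class; NOT the module's real code.
package My::Token;

# Bless the raw array ref in place, so old index-based code keeps working.
sub new {
    my ( $class, $token ) = @_;
    return bless $token, $class;
}

# True for a start tag; if a tag name is given, it must match too.
sub is_start_tag {
    my ( $self, $tag ) = @_;
    return 0 unless $self->[0] eq 'S';
    return 1 unless defined $tag;
    return lc( $self->[1] ) eq lc($tag) ? 1 : 0;
}

# True when the token is a comment.
sub is_comment {
    my $self = shift;
    return $self->[0] eq 'C' ? 1 : 0;
}

# Attribute hash ref for start tags, an empty hash ref otherwise.
sub attr {
    my $self = shift;
    return $self->[0] eq 'S' ? $self->[2] : {};
}

package main;

# Hand-built tokens: ['S', $tag, $attr, $attrseq, $text] and ['C', $text].
my $form = My::Token->new(
    [ 'S', 'form', { action => '/search' }, ['action'], '<form action="/search">' ]
);
my $note = My::Token->new( [ 'C', '<!-- hi -->' ] );

print "start of a form\n" if $form->is_start_tag('form');
print "action is ", $form->attr->{action}, "\n";
print "found a comment\n" if $note->is_comment;
```

Because the token is still the same array ref underneath, $form->[1] continues to return 'form', which is what lets the new methods sit alongside existing code.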

If this interests you and you use HTML::TokeParser, please download and test the distribution. I haven't written the tests yet, but I won't upload to the CPAN without them. I at least managed to get POD written up :)

Cheers,
Ovid

In reply to RFC: HTML::TokeParser::Easy by Ovid
