From the docs:

$p->get_token

This method will return the next token found in the HTML document, or "undef" at the end of the document. The token is returned as an array reference. The first element of the array will be a string denoting the type of this token: "S" for start tag, "E" for end tag, "T" for text, "C" for comment, "D" for declaration, and "PI" for process instructions. The rest of the token array depend on the type like this:

    ["S",  $tag, $attr, $attrseq, $text]
    ["E",  $tag, $text]
    ["T",  $text, $is_data]
    ["C",  $text]
    ["D",  $text]
    ["PI", $token0, $text]

It appears you're trying to read the href attribute from a token that has no attributes (such as an end tag or a text token). Also, not every HTML start tag has an href attribute, so matching against a missing one should produce "uninitialized value" warnings if you've enabled them.

How about (untested):

    while (my $token = $stream->get_token()) {
        if ($token->[0] eq 'S') {                 # start tag
            if (exists $token->[2]->{href}) {     # tag has href attribute
                print "PDF link!\n" if $token->[2]->{'href'} =~ m/\.pdf/;
            }
        }
    }
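For anyone who wants to try the loop above under strict without the original script, here is a minimal self-contained sketch. The sample HTML and the sub name pdf_links are made up for illustration; it also assumes you only care about hrefs that end in .pdf, which is a slightly stricter match than the one above.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample document, invented for this example.
my $html = <<'HTML';
<html><body>
<a href="report.pdf">Report</a>
<a name="top">No href here</a>
<a href="index.html">Home</a>
<p>Plain text</p>
</body></html>
HTML

# Return the href values of all start tags that point at a .pdf file.
sub pdf_links {
    my ($doc) = @_;

    # A reference to a scalar is parsed as the literal document.
    my $stream = HTML::TokeParser->new(\$doc)
        or die "Cannot create parser";

    my @links;
    while (my $token = $stream->get_token) {
        my ($type, $tag, $attr) = @$token;
        next unless $type eq 'S';            # only start tags carry an attribute hash
        next unless defined $attr->{href};   # skip tags without an href
        push @links, $attr->{href} if $attr->{href} =~ /\.pdf$/i;
    }
    return @links;
}

print "PDF link: $_\n" for pdf_links($html);
```

Checking the token type first is what keeps strict happy: for anything other than an "S" token, the third element is either a plain string or missing, not a hash reference.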

updated:

Your code "works" without strict because $something->{href}, where $something holds a string rather than a hash reference, is treated as a symbolic reference: Perl uses the global hash whose name is the value of $something, creating it if it doesn't exist yet (i.e. if $something eq 'blah', a global hash %blah will be created if it doesn't already exist). This can cause all kinds of mayhem and is a good reason to always use strict (see strict 'refs' in the strict documentation and symbolic references in perlref).
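A small sketch of that effect, using only core Perl. The variable names and the value 'oops.pdf' are invented for the demonstration; %blah is declared with our only so the result can be inspected under strict.

```perl
use strict;
use warnings;

our %blah;                  # the package hash a symbolic reference will end up targeting

my $something = 'blah';     # a plain string, not a hash reference

{
    no strict 'refs';       # simulate running without strict
    $something->{href} = 'oops.pdf';    # silently writes to %main::blah
}
print "Without strict: \$blah{href} is now '$blah{href}'\n";

# With strict 'refs' back in effect, the same dereference dies at run time.
my $ok = eval { my $value = $something->{href}; 1 };
print "With strict: $@" unless $ok;
```

The first access quietly modifies an unrelated global, while the second stops with an error that points at the offending line, which is usually what you want.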

updated: moved doc links to cpan.org, perldoc.com is messing up again.


In reply to Re: use strict and TokeParser by Joost
in thread use strict and TokeParser by young_stu
