in reply to Re^2: HTML::TokeParser Problem
in thread HTML::TokeParser Problem

I think this is where you're supposed to use a '%HH' escape to represent the "special" character (a double quote in this case) without breaking everything else:
... saloon.18%22 alloy...
The same might apply to the apostrophe that occurs later in the same string (replace it with "%27"), and since you'll be doing stuff in perl with this string, you'd better treat the dollar sign as well ("%24").

By any chance, has something already been done to this text, in terms of "decoding" uri escapes, before you get to the point in your script that throws the error? If so, maybe just postpone doing that sort of step until later in the script.

Update (oops): As tye points out in the following reply, I'm wrong -- it's not a URI-escape thing, it's an HTML entity thing. So, my question should have been phrased "has something been done to decode HTML entity references (like &quot;)?" If so, don't do that, or do it later.
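
To make the entity point concrete, here is a minimal sketch using HTML::Entities (it ships in the same distribution as HTML::TokeParser). The sample string and the field name "wheel" are made up for illustration; they are not taken from the original page.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);

# Hypothetical attribute value containing a literal double quote.
my $value = 'saloon.18" alloy';

# Encode only the characters that are unsafe inside a double-quoted
# HTML attribute: <, >, & and the double quote itself.
my $safe = encode_entities($value, '<>&"');

# The quote is now &quot;, so it can no longer end the attribute early.
print qq{<input name="wheel" value="$safe">\n};

# Decoding restores the original text. Do this only after parsing.
my $back = decode_entities($safe);
print "$back\n";
```

The order matters: if the entities are decoded before the HTML is handed to the parser, the bare quote is back in the attribute and the parser will stop reading the value there.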

Re^4: HTML::TokeParser Problem (entities)
by tye (Sage) on Dec 17, 2004 at 06:51 UTC

    No, %XX is for URLs, which this isn't. This is just HTML, so use &quot; in place of ".

    - tye        

Re^4: HTML::TokeParser Problem
by sirius98 (Acolyte) on Dec 17, 2004 at 10:08 UTC
    I'm parsing a web page whose content and format I have no control over. Basically, I fetch the page with the LWP::UserAgent module, then I call HTML::TokeParser on that page, and then I run the while loop that is in my post. I'm wondering whether I can change what the TokeParser object I create treats as the end of the field, so that instead of a " it would be a >.

    Any suggestions would be great.

Re^4: HTML::TokeParser Problem
by sirius98 (Acolyte) on Dec 18, 2004 at 05:35 UTC
    I think the only way I'm going to get what I need out of this tag is to capture the whole tag's contents. I was wondering whether there is a way to get the entire contents of the tag instead of, for example, just the value attribute. I have to use TokeParser because I need to pick out one particular tag from several similar tags that differ only by name.
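
One possible approach, as a rough sketch: HTML::TokeParser's get_tag returns start tags as [$tag, $attr, $attrseq, $text], where the fourth element is the original source text of the tag. That lets you select the tag by its name attribute and still get at the raw text. The helper raw_tag_by_name, the sample form, and the field names below are all hypothetical, and the final regex is only a guess at what this particular page needs.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Return the raw source text of the first <$tagname> start tag whose
# name attribute equals $wanted, or undef if there is no such tag.
sub raw_tag_by_name {
    my ($html, $tagname, $wanted) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";
    while (my $tag = $p->get_tag($tagname)) {
        # Start tags come back as [$tag, $attr, $attrseq, $text].
        my (undef, $attr, undef, $text) = @$tag;
        next unless defined $attr->{name} && $attr->{name} eq $wanted;
        return $text;
    }
    return;
}

# Made-up sample page with an unescaped quote inside a value.
my $html = <<'HTML';
<form>
<input type="hidden" name="colour" value="red">
<input type="hidden" name="wheel" value="saloon.18" alloy">
</form>
HTML

my $raw = raw_tag_by_name($html, 'input', 'wheel');
if (defined $raw) {
    print "raw tag: $raw\n";
    # Crude and page-specific: assumes value is the last attribute and
    # takes everything up to the final quote before the closing '>'.
    if ($raw =~ /value="(.*)"\s*\/?>\z/s) {
        print "value: $1\n";
    }
}
```

The parser is still used to find the right tag, so the "differentiated only by name" requirement is met; only the broken attribute is recovered by hand from the raw text.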