in reply to HTML::TokeParser Select List into Array

The thing that distinguishes wsfp's code from yours (apart from his use of the "Simple" version of TokeParser, which is irrelevant) is this: when it detects the start tag for the "select" block in the html data, it sets a state variable and continues to parse through the data. While that state variable is set, subsequent tags and text returned by the parser are stored as components of the "select" block; once the end tag for the "select" block is detected, the state variable is reset to false (the capture is done).

In the OP, the code seems to assume (falsely) that once the "select" start tag is seen, this event (and the content returned by the parser) encompasses the whole block. It doesn't. It's just the start-tag, and you have to keep reading until you see the corresponding end tag in order to obtain the whole content of that html block.
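Here is a minimal sketch of that state-variable idea, using plain HTML::TokeParser rather than the "Simple" variant. It is not wsfp's actual code: the sub name select_options, the sample markup, and the choice to keep only the non-blank text tokens are all my own assumptions for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return the text found between <select> and </select>.
sub select_options {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Could not create parser";

    my $in_select = 0;    # the state variable
    my @options;

    while ( my $token = $p->get_token ) {
        my $type = $token->[0];

        if ( $type eq 'S' && $token->[1] eq 'select' ) {
            $in_select = 1;    # start tag seen: begin capturing
        }
        elsif ( $type eq 'E' && $token->[1] eq 'select' ) {
            $in_select = 0;    # end tag seen: capture is done
        }
        elsif ( $in_select && $type eq 'T' ) {
            # For text tokens, element 1 holds the text itself.
            my $text = $token->[1];
            $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
            push @options, $text if length $text;
        }
    }
    return @options;
}

# Made-up sample data, just to exercise the sub.
my $sample = <<'HTML';
<form>
  <select name="color">
    <option value="r">Red</option>
    <option value="g">Green</option>
  </select>
  <p>Not part of the select</p>
</form>
HTML

print join( ", ", select_options($sample) ), "\n";
```

The point to notice is that the "select" start tag only flips the flag; everything you actually want arrives in later tokens, and the loop keeps reading until the end tag turns the flag off again.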

(If you were trying to extract a tag that was nestable, like "ul" or "table", you'd probably have to maintain a stack, to keep track of nesting level. This probably won't come up for "select". Or you could take a different approach entirely, with something like HTML::TreeBuilder, which I have never used, so I'm not well informed about its suitability here.)
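For the nestable case, a rough sketch might replace the true/false flag with a depth counter, which is the simplest form of the stack mentioned above when you only care about one tag name. The sub name nested_text and its arguments are hypothetical, and this assumes the markup is reasonably well balanced.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: collect all text inside $wanted elements,
# including text inside nested elements of the same name.
sub nested_text {
    my ( $html, $wanted ) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Could not create parser";

    my $depth = 0;    # how many $wanted elements are currently open
    my @text;

    while ( my $token = $p->get_token ) {
        my $type = $token->[0];

        if ( $type eq 'S' && $token->[1] eq $wanted ) {
            $depth++;
        }
        elsif ( $type eq 'E' && $token->[1] eq $wanted ) {
            $depth-- if $depth > 0;    # ignore stray end tags
        }
        elsif ( $type eq 'T' && $depth > 0 ) {
            my $chunk = $token->[1];
            $chunk =~ s/^\s+|\s+$//g;
            push @text, $chunk if length $chunk;
        }
    }
    return @text;
}
```

With a plain flag, the first inner end tag would switch capturing off too early; with the counter, capturing stops only when the outermost element closes.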
