gnangia has asked for the wisdom of the Perl Monks concerning the following question:

Is it possible to send any other variable(s) to the "start" event handler of HTML::Parser, other than the standard $self, $tag, $attr, etc.?
  • Comment on Passing other variables to start handler in HTML::Parser

(JavaScript::SpiderMonkey) Re: Passing other variables to start handler in HTML::Parser
by PodMaster (Abbot) on Nov 15, 2002 at 18:25 UTC
    No.
    Since $self is a blessed hashref, you can always stash your own data in it, but be careful not to clobber the keys the parser uses itself (a safe bet would be a key like $$self{"\0_my_extra_args"}).
    What exactly are you attempting to do?
    Maybe you'd be better off using HTML::TokeParser::Simple.
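    The hashref suggestion above could be sketched roughly as follows. This is only an illustration, not PodMaster's code: the key name, the `label` field, the `@seen` array, and the inline HTML are all made up for the example, and it assumes the version 3 handler API of HTML::Parser.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical key name; the "\0" prefix follows the suggestion above and
# is only meant to avoid clashing with keys the parser uses itself.
my $KEY = "\0_my_extra_args";

my @seen;    # collected for demonstration only

sub start {
    my ($self, $tagname, $attr) = @_;
    return unless $tagname eq 'img';
    my $extra = $self->{$KEY};    # read back the stashed data
    my $src   = defined $attr->{src} ? $attr->{src} : '(no src)';
    push @seen, join(': ', $extra->{label}, $src);
}

my $p = HTML::Parser->new(
    api_version => 3,
    start_h     => [ \&start, 'self, tagname, attr' ],
);

# Stash the extra data on the parser object before parsing.
$p->{$KEY} = { label => 'page-1' };

$p->parse('<p><img src="a.png"><IMG SRC="b.png"></p>');
$p->eof;

print "$_\n" for @seen;
```

    Tag and attribute names are lowercased by default, so both images are reported under the same `label`.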

    update: Ah, I see. Are you going to be handling forms? I still say go with HTML::TokeParser::Simple, or possibly my HTML::LinkExtractor ;)

    update: Look, this is interesting ;)

    use JavaScript::SpiderMonkey;

    my $jS = JavaScript::SpiderMonkey->new();

    # Initialize Runtime/Context
    $jS->init();

    # create a new object, and set a method
    my $document = $jS->object_by_path("document");
    $jS->function_set("write", \&Write , $document);
    $jS->function_set("writeln", \&WriteLn , $document);
    $jS->property_by_path("document.location.href");

    # Execute some code
    my $rc = $jS->eval(q[
        document.location.href = append("http://", "www.perlmonks.org");
        document.write("URL is ", document.location.href);
        document.writeln("\nURL is ", document.location.href);
        function append(first, second) {
            return first + second;
        }
    ]);

    # Get the value of a property set in JS
    my $url = $jS->property_get("document.location.href");
    print "the $url is\n";

    $jS->destroy();

    sub Write { print for @_; }
    sub WriteLn { print for @_; print "\n"; }

    __END__
    URL is http://www.perlmonks.org
    URL is http://www.perlmonks.org
    the http://www.perlmonks.org is

    ____________________________________________________
    ** The Third rule of perl club is a statement of fact: pod is sexy.

      I am actually attempting to write a generic web client in Perl that measures response times. After I do an HTTP GET on the URL (which retrieves the page source as text), I parse out all the image URLs and then do a separate HTTP GET for each one within the start handler. The problem is that I need to pass in some of my own variables for error notification, in case an image file is missing or the response status is anything other than "is_success".
        You don't need to pass them; they can be coded into your start handler. It's part of your logic.
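        One way to read "coded into your start handler" is a closure: the handler is built by a small factory and simply sees the extra variables it was created with. The sketch below assumes that reading. `make_start_handler`, `$page_url`, `@errors`, and the example URL are hypothetical names chosen for illustration, and the actual HTTP GET for each image is left as a comment.

```perl
use strict;
use warnings;
use HTML::Parser;

# Build a start handler that closes over extra variables.
# make_start_handler, $page_url and $errors are hypothetical names.
sub make_start_handler {
    my ($page_url, $errors) = @_;
    return sub {
        my ($tagname, $attr) = @_;
        return unless $tagname eq 'img';
        if (!defined $attr->{src} || $attr->{src} eq '') {
            push @$errors, "img without src on $page_url";
            return;
        }
        # A real client would issue the HTTP GET for $attr->{src} here
        # and record a failed status in @$errors as well.
    };
}

my @errors;
my $p = HTML::Parser->new(
    api_version => 3,
    start_h     => [
        make_start_handler('http://example.com/', \@errors),
        'tagname, attr',
    ],
);

$p->parse('<img src="ok.png"><img alt="broken">');
$p->eof;

print "$_\n" for @errors;
```

        Because the handler is an ordinary code reference, nothing about the parser has to change; each page can get its own handler with its own error list.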
Re: Passing other variables to start handler in HTML::Parser
by pg (Canon) on Nov 15, 2002 at 19:04 UTC
    HTML::Parser is a generic module, and it already passes you everything it can get from the HTML file. That is good enough; the rest is up to you. The door is always open if you want to subclass HTML::Parser, but I don't think that is a good idea here.

    use strict;
    use warnings;
    use HTML::Parser;

    # React to a single <img> tag's attributes.
    sub handle_img {
        my $attr = shift;
        if (defined $attr->{"src"} && $attr->{"src"} eq "perlmonks") {
            print "Oh, a monk\n";
        } else {
            print "Who are you?\n";
        }
    }

    # Start-tag handler: dispatch <img> tags to handle_img.
    sub start {
        my ($self, $tagname, $attr) = @_;
        if (lc($tagname) eq "img") {
            handle_img($attr);
        }
    }

    my $p = HTML::Parser->new(start_h => [\&start, "self, tagname, attr"]);
    $p->parse_file("a.html");
Re: Passing other variables to start handler in HTML::Parser
by Anonymous Monk on Nov 15, 2002 at 18:34 UTC
    Yes, you can only pass 5 params: self, tag, attr, attr_seq, and text, nothing else. However, that's all you will get from a valid HTML file, so it is good enough. See, it basically passes everything it reads from your HTML file.
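    For what it's worth, the argspec string in the version 3 API is a little more flexible than a fixed list of five: as far as the HTML::Parser documentation describes it, a quoted literal string inside the argspec is passed to the handler as-is. The sketch below assumes that behaviour; the literal `'from-argspec'`, the `@calls` array, and the inline HTML are made up for the example. It only helps for constants known when the handler is registered, not for values that change per call.

```perl
use strict;
use warnings;
use HTML::Parser;

my @calls;    # collected for demonstration only

sub start {
    # $note receives the quoted literal from the argspec below.
    my ($tagname, $attr, $note) = @_;
    push @calls, join(':', $note, $tagname);
}

my $p = HTML::Parser->new(
    api_version => 3,
    # 'from-argspec' is a quoted literal inside the argspec string.
    start_h => [ \&start, q{tagname, attr, 'from-argspec'} ],
);

$p->parse('<a href="x"><img src="y.png">');
$p->eof;

print "$_\n" for @calls;
```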