PerlMonks  

Problem with HTML::Parser

by Skeeve (Parson)
on Jan 30, 2007 at 15:31 UTC ( [id://597385]=perlquestion )

Skeeve has asked for the wisdom of the Perl Monks concerning the following question:

I have a problem using HTML::Parser. Maybe I'm missing something plainly obvious, but I don't see it. I should say that this is the first time I have ever used HTML::Parser.

What I thought I had done in the script shown below is:

  1. invoked the parser for every HTML file found
  2. dumped all attributes for every embed tag encountered

But what I really get is, for each embed tag in every file found, just one attribute.

Can someone please hit my head against the parts where I made a mistake?

#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;
use File::Find;
use Data::Dumper;

my $hp = HTML::Parser->new(
    api_version => 3,
    start_h     => [ \&start, 'tag,attr' ],
    # This should call the start subroutine for every start tag,
    # passing the tag name and a reference to the attribute hash
);

find(
    sub {
        return unless /\.html?$/;
        find_embed( $_ ); # in find_embed I will invoke the parser
    },
    @ARGV
) if scalar(@ARGV);

sub find_embed {
    my ($filename) = @_;
    $hp->parse_file( $filename ); # Now the parser is started
}

sub start {
    my ($tag, $attr) = @_;
    return unless $tag eq 'embed'; # ignore every tag but embed
    print Dumper($attr);           # Dump the attributes.
}

s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
+.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

Replies are listed 'Best First'.
Re: Problem with HTML::Parser
by Skeeve (Parson) on Jan 30, 2007 at 15:50 UTC

    Sorry! I'm plain stupid!

    It works perfectly! My mistake was the invocation:

    ./find_embed /some/dir/ectory | grep embed

    ARGH! grep filters line by line, so the only attribute I got was the one whose line contained the character sequence "embed".

    BIG SORRY!

    Thanks to andye for hitting me with a tunafish in the chatterbox. I needed that!
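
    For anyone who trips over the same thing, here is a minimal sketch of the effect. The dumper_output function is a hypothetical stand-in for the script, and the attribute values in it are made up for illustration; the only assumption is that Data::Dumper, with its default settings, prints each hash entry on its own line.

```shell
#!/bin/sh
# Hypothetical stand-in for the script's output (values are invented):
# Data::Dumper prints one attribute per line, so a single embed tag
# produces several lines of output.
dumper_output() {
    printf '%s\n' \
        '$VAR1 = {' \
        "          'src' => 'embed/movie.swf'," \
        "          'width' => '400'," \
        "          'height' => '300'" \
        '        };'
}

# Unfiltered: all three attributes are visible.
dumper_output

# Filtered as in the invocation above: grep keeps only the lines that
# contain "embed", so the other attributes silently disappear.
dumper_output | grep embed
```

    The start handler already skips every tag but embed, so the extra grep is not needed; dropping it shows all attributes.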


    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
      Thanks to andye for hitting me with a tunafish in the chatterbox.
      Heh, that happens when you read too much Matt Ruff :-))


      holli, /regexed monk/
