mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:
I'm trying to extract the values of specific attributes from various HTML elements using XPath expressions and HTML::TreeBuilder::XPath. Say I have an anchor, <a href="foobar.html">One Link</a>, and I would like to extract the value of its "href" attribute, which would be "foobar.html". Or, if I have metadata such as <meta name="description" content="foobar" />, I would like to find the value of the "content" attribute ("foobar") for the element whose "name" attribute has the value "description". I think I have the right XPath, since it works in other tools, but instead of giving me the value of the "content" attribute it gives me this error:
Can't locate object method "as_text" via package "HTML::TreeBuilder::XPath::Attribute" at ./x1.pl line 15.
What have I missed in the code below, and how should I tweak it?
#!/usr/bin/perl

use HTML::TreeBuilder::XPath;
use strict;
use warnings;

my $root = HTML::TreeBuilder::XPath->new;
$root->parse_file(\*DATA);
$root->eof();

for my $d ($root->findnodes('//html/head/meta[@name="description"]/@content')) {
    print qq(D=\n);
    print $d->as_text;
}

$root->delete;

exit(0);

__DATA__
<html>
<head>
<meta name="description" content="foobar" />
</head>
<body>
<h1>FOO</h1>
<p>Bar</p>
</body>
</html>
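The error message suggests that an XPath ending in /@content yields attribute nodes (HTML::TreeBuilder::XPath::Attribute) rather than HTML::Element objects, so as_text is not available on them. A minimal sketch of two possible ways around this, assuming the installed version of the module provides getValue on attribute nodes and findvalue on the tree (both should be checked against the module's documentation); the $html string, @values, and $first are illustrative names that do not appear in the original script:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Same document as in the question, held in a string for illustration.
my $html = <<'HTML';
<html>
<head>
<meta name="description" content="foobar" />
</head>
<body>
<h1>FOO</h1>
<p>Bar</p>
</body>
</html>
HTML

my $root = HTML::TreeBuilder::XPath->new;
$root->parse_content($html);

my $xpath = '//html/head/meta[@name="description"]/@content';

# Option 1: keep findnodes, but ask each attribute node for its value
# with getValue instead of calling as_text (an HTML::Element method).
my @values = map { $_->getValue } $root->findnodes($xpath);
print "D=$_\n" for @values;

# Option 2: let findvalue return the string value of the match directly.
my $first = $root->findvalue($xpath);
print "D=$first\n";

$root->delete;
```

The first form is closer to the original loop and handles several matches; the second is shorter when only a single value is expected.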
Re: HTML::TreeBuilder::XPath finding attribute values
by Corion (Patriarch) on Aug 05, 2019 at 18:00 UTC
Re: HTML::TreeBuilder::XPath finding attribute values
by tangent (Parson) on Aug 06, 2019 at 14:57 UTC
by mldvx4 (Hermit) on Aug 11, 2019 at 12:32 UTC