in reply to HTML::TreeBuilder:: identifing xpath-expression - first attempt
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use LWP::Simple;

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content(get 'http://www.educa.ch/dyn/79376.asp?id=1187');

# Print the text of each matched cell rather than the text of the whole tree
print $_->as_text, "\n" for $tree->findnodes(q{//tr[1]/td[2]});
Re^2: HTML::TreeBuilder:: identifing xpath-expression - first attempt
by Anonymous Monk on Oct 17, 2010 at 12:31 UTC
by Perlbeginner1 (Scribe) on Oct 17, 2010 at 13:35 UTC

This is a great place for learning. I am so happy about the answers: they show me that this community is alive and great at lending a helping hand. This is a great experience! I will read all the answers later, since I have to leave the house at the moment. I will come back later today. Meanwhile, many, many thanks for everything!

Update: Well, if I am able to identify the XPath expressions for this site, http://www.educa.ch/dyn/79376.asp?id=1187, then I am able to do the job! Note: if I can do it for one page, I can do it for more than 5000, since I have to parse all of them ;-)

Well, we see that there are three tasks:

a. fetching the pages
b. parsing them
c. storing the results in a database

For the first task we can use LWP::UserAgent or WWW::Mechanize; for the second task we can use HTML::Parser! For the third task we need some knowledge of Perl's DBI.
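Just to make the three tasks concrete, here is a minimal sketch of how they might fit together. It is not a finished solution: the inline HTML is a made-up stand-in for a fetched page (in real use the string would come from LWP::UserAgent or WWW::Mechanize), the table name `school`, its columns, and the helper `extract_rows` are hypothetical, and it assumes DBD::SQLite is installed so that an in-memory database can be used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use DBI;

# Task a (stand-in): hypothetical page content. In real use this string
# would be fetched with LWP::UserAgent or WWW::Mechanize.
my $html = <<'HTML';
<html><body><table>
<tr><td>Schule</td><td>Beispielschule</td></tr>
<tr><td>Ort</td><td>Bern</td></tr>
</table></body></html>
HTML

# Task b: parse the page and collect the first two cells of every table row.
sub extract_rows {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_content($content);
    my @rows;
    for my $tr ($tree->findnodes('//tr')) {
        my @cells = map { $_->as_text } $tr->findnodes('./td');
        push @rows, [ @cells[0, 1] ] if @cells >= 2;
    }
    $tree->delete;    # free the parse tree once the text has been copied out
    return @rows;
}

# Task c: store the pairs. Assumes DBD::SQLite; the schema is hypothetical.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1 });
$dbh->do('CREATE TABLE school (label TEXT, value TEXT)');
my $sth = $dbh->prepare('INSERT INTO school (label, value) VALUES (?, ?)');

my @rows = extract_rows($html);
$sth->execute(@$_) for @rows;

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM school');
print "stored $count rows\n";
```

For 5000 pages the same `extract_rows` call would sit inside a loop over the fetched pages, with one prepared statement reused for all inserts.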
by Perlbeginner1 (Scribe) on Oct 17, 2010 at 17:10 UTC

You refer to the page that explains and helps with finding XPaths. That is very, very interesting! I am trying to learn something here. You use this link: http://www.perlmonks.org/?node_id=865792 It leads to this code!
Question: does the code mentioned above print out the paths of a (general) HTML document? At least you make use of it here:
That is very, very impressive. I am trying to understand this code, and your use of the example that you were referring to.
If I get you right, then I can use this script in many, many cases to get the XPaths out!? Is this right? I look forward to hearing from you! I guess that I can learn a lot! Please help me here!
by Anonymous Monk on Apr 02, 2011 at 15:11 UTC

No, not at all, I'm an asshole, I make shit up, I'm not here to help anyone, I just post stuff to tease, it never works :/