PerlMonks  

Re: How do I extract text from an HTML page?

by CountZero (Bishop)
on Aug 03, 2003 at 16:06 UTC


in reply to How do I extract text from an HTML page?

Well, whatever you do, the one way not to go is to pick the HTML apart with regexes yourself. That only works for the simplest and most regular HTML, and it will break before you know it.
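To show what leaving the job to a real parser can look like, here is a minimal sketch. It is in Python only because its standard library ships an HTML tokenizer (html.parser); in Perl, the HTML::Parser family of modules on CPAN fills the same role. The names TextExtractor and extract_text are made up for this example, and skipping the contents of script and style elements is my assumption about what counts as "text" on your page.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>.

    Hypothetical helper for illustration, not part of any library.
    """

    SKIP = {"script", "style"}  # assumption: these never hold visible text

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside skipped elements
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text of an HTML string as one space-separated line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.chunks)
```

Calling extract_text on the page source gives you the text without you ever writing a pattern for tags, attributes, or entities, which is exactly the part a hand-rolled regex tends to get wrong.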

Another approach is to go to the source of the data behind that first web page. Assuming the page is generated from some database, can't you query that database directly and get the data from there?

CountZero

"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
