PerlMonks  

Re: Extracting paragraphs from html

by merlyn (Sage)
on Sep 11, 2005 at 16:50 UTC (#491072)


in reply to Extracting paragraphs from html

Use XML::LibXML in HTML-parsing mode, then use an XPath that looks for text() nodes that have a length greater than N.

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.


update: See Locate large HTML paragraphs with XML::LibXML.
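
For readers who want to see the shape of the suggestion above, here is a minimal sketch. It is not the code from the linked node. The threshold value, the sample HTML, and the variable names are all made up for illustration, and the use of normalize-space() (so that whitespace-only text nodes do not count) and the restriction to descendants of body are my assumptions, not part of the original advice.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical threshold "N": minimum text length, in characters, to keep.
my $min_length = 40;

# Sample HTML, invented for this illustration.
my $html = <<'HTML';
<html><body>
<p>Short.</p>
<p>This paragraph is comfortably longer than the forty-character threshold.</p>
</body></html>
HTML

# Parse in HTML mode; recover => 2 tolerates malformed markup silently.
my $doc = XML::LibXML->load_html(
    string  => $html,
    recover => 2,
);

# Select text nodes under <body> whose whitespace-normalized length
# exceeds the threshold (normalize-space is an assumption of this sketch).
my $xpath = "//body//text()[string-length(normalize-space(.)) > $min_length]";
my @long_texts = map { $_->nodeValue } $doc->findnodes($xpath);

print "$_\n" for @long_texts;
```

Note that this selects individual text nodes, so a paragraph broken up by inline markup such as links or emphasis is measured piece by piece. Matching on the enclosing element instead (for example `//p[string-length(normalize-space(.)) > N]`) may be closer to what you want for whole paragraphs.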
