PerlMonks  

Re: Link Parser, something to be desired?

by planetscape (Chancellor)
on May 29, 2009 at 23:43 UTC


in reply to Link Parser, something to be desired?

Hi! Welcome back!

First, don't use regexen to parse HTML. There are many nodes here on PM that will tell you why that's a Bad Idea™.

Instead, use something like WWW::Mechanize's find_all_links() or HTML::TreeBuilder's look_down() to find your links.
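For what it's worth, here is a minimal sketch of the look_down() route. The sample HTML and the variable names are made up for illustration and are not taken from your code, and it assumes HTML::TreeBuilder is installed from CPAN. It collects every <a> that has an href, then swaps each of those elements for its own content so the link text survives but the link does not:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input, not taken from the original thread.
my $html = <<'HTML';
<html><body>
<p>See <a href="http://example.com/one">the first page</a> and
<a href="/two" title="a > b">the second</a>. <a name="top">An anchor</a>.</p>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context, look_down() returns every element matching all criteria:
# here, <a> elements whose href attribute is non-empty.
my @links = $tree->look_down( _tag => 'a', href => qr/./ );

# Grab the targets before the elements are modified.
my @hrefs = map { $_->attr('href') } @links;
print "$_\n" for @hrefs;

# Replace each link with its own content (the link text), then free
# the now-detached <a> element.
$_->replace_with_content->delete for @links;

my $stripped = $tree->as_HTML;
print $stripped, "\n";

$tree->delete;
```

Note that the title attribute containing ">" is exactly the kind of thing that trips up a naive regex, while the parser takes it in stride. The <a name="top"> anchor is left alone because it has no href.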

Second, had you run a Google advanced search against PerlMonks for "html remove link", you'd have found several helpful nodes.

IMHO, "Re: Regex: Strip <script> tags?" looks quite promising. ;-)

Good luck!

HTH,

planetscape
