Re^3: Extracting stylesheet links or url from HTML Page

XML::LibXML is going to be the fastest; if your program spends much of its time on I/O, you'll have to benchmark to see how much faster it is in practice. XML::Twig is the most Perly and probably the easiest and most flexible to hack for most Perl hackers. HTML::TokeParser::Simple or HTML::TokeParser will be the most reliable, since some HTML files are simply too malformed for the others to handle.

For a serious application that really needs the speed, I'd try:

XML::LibXML first, for each document:
- Worked? Move on to the next document.
- Failed to parse? Use HTML::TokeParser for that one, then move on to the next document.
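
Here is a minimal sketch of that try-then-fall-back approach, assuming the goal is the href of every stylesheet link. The helper name stylesheet_links is made up for illustration, and the code relies on XML::LibXML dying on a parse error when recover is off, so check that against your installed version.

```perl
use strict;
use warnings;
use XML::LibXML;
use HTML::TokeParser;

# Hypothetical helper (not part of either module): returns the href of
# every <link> whose rel attribute mentions "stylesheet".
sub stylesheet_links {
    my ($html) = @_;
    my @links;

    # First choice: XML::LibXML in strict mode (recover => 0), wrapped in
    # eval because a parse failure is reported by dying.
    my $ok = eval {
        my $doc = XML::LibXML->load_html( string => $html, recover => 0 );
        for my $node ( $doc->findnodes('//link[@href]') ) {
            my $rel = $node->getAttribute('rel') // '';
            push @links, $node->getAttribute('href')
                if $rel =~ /\bstylesheet\b/i;
        }
        1;
    };
    return @links if $ok;

    # Fallback: HTML::TokeParser tokenizes instead of building a tree,
    # so it copes with markup that is too broken for the strict parser.
    @links = ();
    my $parser = HTML::TokeParser->new( \$html )
        or die "Cannot set up HTML::TokeParser";
    while ( my $tag = $parser->get_tag('link') ) {
        my $attr = $tag->[1];    # hash ref of the tag's attributes
        next unless defined $attr->{href};
        push @links, $attr->{href}
            if ( $attr->{rel} // '' ) =~ /\bstylesheet\b/i;
    }
    return @links;
}

my $page = '<html><head><link rel="stylesheet" href="a.css"></head>'
         . '<body></body></html>';
print "$_\n" for stylesheet_links($page);
```

The rel test is done in Perl rather than in XPath so both branches apply the same case-insensitive rule. If you would rather keep everything in XML::LibXML, its recover option is another route, at the cost of accepting whatever tree the parser manages to build.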

Read the POD for each module. They all take different kinds of arguments: file names, file handles, strings.
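
As a rough illustration of those differences, the sketch below feeds the same document to XML::LibXML and HTML::TokeParser as a string, a file name, and a file handle. The sample markup and the temporary file are assumptions made up for the demo; the POD remains the authority on the exact arguments.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::LibXML;
use HTML::TokeParser;

# Made-up sample document for the demo.
my $html = '<html><head><title>demo</title></head>'
         . '<body><p>hi</p></body></html>';

# Write the sample to a temporary file so the file-based calls have input.
my ( $out, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$out} $html;
close $out or die "close failed: $!";

# XML::LibXML: named arguments select the input source.
my $from_string = XML::LibXML->load_html( string   => $html );
my $from_file   = XML::LibXML->load_html( location => $path );
open my $in, '<', $path or die "Cannot open $path: $!";
my $from_handle = XML::LibXML->load_html( IO => $in );
close $in;

# HTML::TokeParser: one positional argument; its type decides the source.
my $tp_string = HTML::TokeParser->new( \$html );    # reference to a string
my $tp_file   = HTML::TokeParser->new($path);       # plain scalar = file name
open my $in2, '<', $path or die "Cannot open $path: $!";
my $tp_handle = HTML::TokeParser->new($in2);        # open file handle

# Pull the <title> out of each one to confirm they all saw the same input.
my @libxml_titles = map { $_->findvalue('//title') }
    ( $from_string, $from_file, $from_handle );
my @toke_titles;
for my $p ( $tp_string, $tp_file, $tp_handle ) {
    $p->get_tag('title');            # advance to the <title> start tag
    push @toke_titles, $p->get_text; # text up to the next tag
}
close $in2;

print "XML::LibXML:      @libxml_titles\n";
print "HTML::TokeParser: @toke_titles\n";
```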

(update: fixed typo.)

Re^4: Extracting stylesheet links or url from HTML Page by mr_p (Scribe) on Jun 24, 2010 at 17:14 UTC
Thanks for your answer. I did benchmark, and LibXML was the faster one.