in reply to Reaped: Re^2: Extracting stylesheet links or url from HTML Page
in thread Extracting stylesheet links or url from HTML Page
XML::LibXML is going to be the fastest; if your program spends much of its time in I/O, you'll have to benchmark to see how much faster it is in practice. XML::Twig is the most Perly and probably the easiest and most flexible to hack on for most Perl hackers. HTML::TokeParser::Simple or HTML::TokeParser will be the most reliable, since some HTML files are simply too invalid for the others to handle.
A serious application that really needed the speed...? I'd try:
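Something along these lines: a minimal sketch using XML::LibXML's HTML parser, not a definitive implementation. The sample markup and the variable names are made up for illustration, and the XPath match on `rel` is exact and case-sensitive, which real-world pages may not respect.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample page; in practice this would come from a file or a fetch.
my $html = <<'HTML';
<html><head>
<link rel="stylesheet" type="text/css" href="/css/main.css">
<link rel="icon" href="/favicon.ico">
<link rel="stylesheet" href="/css/print.css" media="print">
</head><body><p>Hello</p></body></html>
HTML

# recover => 2 asks libxml2 to carry on past markup errors without warning.
my $doc = XML::LibXML->load_html(
    string  => $html,
    recover => 2,
);

# Select the href attribute of every <link rel="stylesheet">.
my @stylesheets = map { $_->value }
    $doc->findnodes('//link[@rel="stylesheet"]/@href');

print "$_\n" for @stylesheets;
```

If the pages use values such as `rel="Stylesheet"` or `rel="alternate stylesheet"`, the XPath predicate would need loosening, or the filtering could be done in Perl after selecting all `link` elements.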
Read the Pod for the modules. They will all take different kinds of arguments: files, file handles, strings.
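To illustrate the point about argument kinds, here is a rough sketch with HTML::TokeParser, which takes a file name as a plain scalar, a document string as a scalar reference, or an open file handle. The helper name `stylesheet_hrefs` and the sample markup are hypothetical; the in-memory handle only stands in for a real file.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: accepts anything HTML::TokeParser->new accepts
# (file name, reference to a string, or file handle).
sub stylesheet_hrefs {
    my ($source) = @_;
    my $parser = HTML::TokeParser->new($source)
        or die "Cannot parse source: $!";
    my @hrefs;
    # get_tag returns [ $tag, \%attr, \@attrseq, $text ] for start tags.
    while (my $tag = $parser->get_tag('link')) {
        my $attr = $tag->[1];
        next unless lc($attr->{rel} // '') eq 'stylesheet';
        push @hrefs, $attr->{href} if defined $attr->{href};
    }
    return @hrefs;
}

# Deliberately sloppy markup: unquoted attributes, unclosed <p>.
my $html = '<link rel="stylesheet" href="a.css"><p>unclosed <link rel=StyleSheet href=b.css>';

# Same document passed as a string reference and as a file handle.
my @from_string = stylesheet_hrefs(\$html);

open my $fh, '<', \$html or die "Cannot open in-memory handle: $!";
my @from_handle = stylesheet_hrefs($fh);
close $fh;

print "string: @from_string\n";
print "handle: @from_handle\n";
```

Because this is a tokenizer rather than a tree builder, it never has to make sense of the document as a whole, which is why it copes with markup the stricter parsers reject.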
(update: fixed typo.)