in reply to RFC: URI::URL::Detail
I wanted to propose HTML::LinkExtract as a name, but then I found HTML::LinkExtor and HTML::LinkExtractor. Maybe you can build on top of one of those modules or extend them, or maybe they already do what you want?
"Clean" a URL so it can be used as a string ( in say Regular expressions or MySql insert statements ).
I don't think you need that. For regular expressions you just use /\Q$url\E/ or quotemeta, and for SQL inserts you should use placeholders anyway, so there is no need to escape or clean anything.
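A minimal sketch of both points, assuming a made-up URL and a hypothetical table named links (neither is from the original post). The regex part uses only core Perl. The insert_url sub assumes you already have a connected DBI handle and is not called here.

```perl
use strict;
use warnings;

# Example URL (an assumption); '?', '+' and '.' are regex metacharacters.
my $url  = 'http://example.com/search?q=a+b&page=1';
my $text = "see http://example.com/search?q=a+b&page=1 for details";

# \Q...\E quotes the interpolated string, so it is matched literally.
sub url_in_text {
    my ($haystack, $needle) = @_;
    return $haystack =~ /\Q$needle\E/ ? 1 : 0;
}

# quotemeta does the same thing as a function call.
my $quoted = quotemeta $url;

print "literal match\n" if url_in_text($text, $url);
print "unquoted pattern does not match\n" unless $text =~ /$url/;

# With placeholders the driver handles quoting; the URL is passed as data.
# $dbh is assumed to be a connected DBI handle; 'links' is a hypothetical table.
sub insert_url {
    my ($dbh, $link) = @_;
    my $sth = $dbh->prepare('INSERT INTO links (url) VALUES (?)');
    return $sth->execute($link);
}
```

Note that the unquoted pattern fails on this text, because "search?q" then means "searc", an optional "h", and "q" rather than a literal question mark.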
HTML::TreeBuilder - Overkill?
It is never overkill to use a proper HTML parser for such a task. I don't know whether HTML::TreeBuilder is the best choice for this task, but it should certainly work.
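As a rough illustration of the parser route, here is a sketch using HTML::LinkExtor, one of the modules mentioned above, rather than HTML::TreeBuilder. It assumes HTML::LinkExtor (part of the HTML-Parser distribution on CPAN) is installed. The helper name extract_hrefs and the sample markup are made up for the example.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Collect the href of every <a> tag, in document order.
sub extract_hrefs {
    my ($html) = @_;
    my @hrefs;
    # The callback receives the tag name followed by its link attributes.
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        push @hrefs, $attr{href} if $tag eq 'a' && defined $attr{href};
    });
    $parser->parse($html);
    $parser->eof;
    return @hrefs;
}

my $html = '<p><a href="http://example.com/a?x=1">A</a> <img src="i.png"> <a href="/b">B</a></p>';
print "$_\n" for extract_hrefs($html);
```

No base URL is passed to the constructor, so relative links such as /b come back unchanged.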
Re^2: RFC: URI::URL::Detail
by tmharish (Friar) on Aug 07, 2009 at 14:09 UTC
by moritz (Cardinal) on Aug 07, 2009 at 16:04 UTC