in reply to Re^2: Building a Spidering Application
in thread Building a Spidering Application
You don't need URI::ImpliedBase. The WWW::Mechanize::Link objects that Mech uses and returns have a method, url_abs, that covers this. Of course, it's then up to the spider to decide whether query params are relevant, duplicates, or no-ops and, in the hacky world of HTML4.9, whether fragments are meaningful (though only a JS-aware Mech would be able to care here).
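A rough sketch of what that could look like, not a definitive implementation. It builds WWW::Mechanize::Link objects by hand so it runs without a network; in a real spider they would come from something like `$mech->links`. The `crawl_key` helper, the `%IGNORED_PARAM` list, and the example.com URLs are all made up for illustration, and the policy itself (drop fragments, drop some params, sort the rest) is just one assumption about what counts as a duplicate.

```perl
use strict;
use warnings;
use URI;
use WWW::Mechanize::Link;

# Hypothetical policy: query parameters this spider treats as no-ops.
my %IGNORED_PARAM = map { $_ => 1 } qw(sessionid utm_source);

# Reduce a link to a string key for the spider's "already seen" set.
sub crawl_key {
    my ($link) = @_;

    # url_abs resolves the link's url against its base.
    my $uri = URI->new( $link->url_abs->as_string )->canonical;

    # Assumption: fragments never identify a distinct page for a non-JS client.
    $uri->fragment(undef);

    # Drop ignored params, then sort so ?a=1&b=2 and ?b=2&a=1 share a key.
    my @pairs = $uri->query_form;
    my @kept;
    while ( my ( $k, $v ) = splice @pairs, 0, 2 ) {
        push @kept, [ $k, $v ] unless $IGNORED_PARAM{$k};
    }
    @kept = sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] } @kept;

    if (@kept) {
        $uri->query_form( [ map { @$_ } @kept ] );
    }
    else {
        $uri->query(undef);
    }
    return $uri->as_string;
}

# Two links as they might appear on http://example.com/docs/index.html
my $base  = 'http://example.com/docs/index.html';
my @links = map {
    WWW::Mechanize::Link->new( { url => $_, base => $base, tag => 'a' } )
} ( 'page.html?b=2&a=1#top', '../docs/page.html?a=1&b=2&sessionid=xyz' );

my %seen;
for my $link (@links) {
    my $key = crawl_key($link);
    print "$key\n" unless $seen{$key}++;
}
```

Under these assumptions both links reduce to the same key, so the page would only be queued once. Whether sorting or dropping params is safe depends entirely on the site being spidered.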
Re^4: Building a Spidering Application
by pemungkah (Priest) on Jul 09, 2012 at 16:17 UTC
by Your Mother (Archbishop) on Jul 09, 2012 at 18:22 UTC