RandomWalk has asked for the wisdom of the Perl Monks concerning the following question:
I am feeding a Yahoo directory page to HTML::LinkExtor. This program extracts <a href="http://rds.yahoo.com/S=10341:D2/CS=10341/SS=53744154/*http://www.beltbuckleshop.com/"> as expected. OTOH, it truncates <a href=http://rds.yahoo.com/S=10341:D1/CS=10341/SS=53744154/SIG=112eblhep/*http%3A//www.beltbuckleshop.com/> to "http://www.beltbuckleshop.com", even substituting ":" for "%3A"!
I'm following "Google Hack #44" slavishly.
I *think* this is the nub of the problem, so I've not included more info. Does anyone have experience with this module? Could it be the lack of quotation marks around the "href" value?
Thanks.
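One way to narrow the question down is to feed just the two tags to HTML::LinkExtor in isolation and print what comes back. The following is a minimal sketch, not the code from "Google Hack #44": the `%samples` hash and the `extract_hrefs` helper are names invented here for illustration, and the link text `x` is a placeholder. It relies only on the module's documented `new`, `parse`, `eof`, and `links` methods.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# The two anchor tags from the question: one quoted href, one unquoted.
# (%samples and the link text "x" are illustrative assumptions.)
my %samples = (
    quoted   => '<a href="http://rds.yahoo.com/S=10341:D2/CS=10341/SS=53744154/*http://www.beltbuckleshop.com/">x</a>',
    unquoted => '<a href=http://rds.yahoo.com/S=10341:D1/CS=10341/SS=53744154/SIG=112eblhep/*http%3A//www.beltbuckleshop.com/>x</a>',
);

# Hypothetical helper: return every href that HTML::LinkExtor reports
# for <a> tags in a chunk of HTML.
sub extract_hrefs {
    my ($html) = @_;
    my $parser = HTML::LinkExtor->new;   # no callback, so links accumulate internally
    $parser->parse($html);
    $parser->eof;
    my @hrefs;
    for my $link ($parser->links) {
        my ($tag, %attrs) = @$link;      # each entry is [tag, attr => url, ...]
        push @hrefs, $attrs{href} if $tag eq 'a' && defined $attrs{href};
    }
    return @hrefs;
}

# Print what the module itself returns, before any other processing.
for my $name (sort keys %samples) {
    print "$name: $_\n" for extract_hrefs($samples{$name});
}
```

If both hrefs print in full here, the truncation is presumably happening in a later step of the program rather than in the module. Note that "%3A" is simply the percent-encoding of ":", so any step that URL-decodes the string (URI::Escape's `uri_unescape`, for example) would produce the substitution described above.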
Replies are listed 'Best First'.
Re: HTML::LinkExtor idiosyncracy
by ikegami (Patriarch) on Apr 22, 2005 at 23:46 UTC
Re: HTML::LinkExtor idiosyncracy
by tlm (Prior) on Apr 22, 2005 at 23:56 UTC
Re: HTML::LinkExtor idiosyncracy
by eibwen (Friar) on Apr 23, 2005 at 13:12 UTC