in reply to Infinite loop prevention for spider

I saw a talk by the author of String::Trigram, in which he mentioned using his module for a similar problem: determining whether a web page had changed or not. If you tune your similarity threshold well enough, this could be another measure of "page similarity", that is, of deciding that "these two URLs are the same page".
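
To make the idea concrete, here is a rough sketch of a trigram comparison in plain Perl. It is not the String::Trigram implementation and it does not use that module's API; the sub names (trigrams, similarity, same_page) and the default threshold of 0.9 are my own assumptions for illustration, and you would have to tune the threshold against real pages.

```perl
use strict;
use warnings;

# Count the character trigrams of a string (lowercased, whitespace collapsed).
# Returns a hash reference of trigram => number of occurrences.
sub trigrams {
    my ($text) = @_;
    $text = lc $text;
    $text =~ s/\s+/ /g;
    my %count;
    # For strings shorter than three characters the range is empty.
    $count{ substr($text, $_, 3) }++ for 0 .. length($text) - 3;
    return \%count;
}

# Jaccard-style similarity on trigram counts: shared / total, between 0 and 1.
# This is a simple stand-in measure, not necessarily what String::Trigram computes.
sub similarity {
    my ($x, $y) = @_;
    my ($tx, $ty) = (trigrams($x), trigrams($y));
    my %all = (%$tx, %$ty);
    my ($shared, $total) = (0, 0);
    for my $t (keys %all) {
        my $cx = $tx->{$t} // 0;
        my $cy = $ty->{$t} // 0;
        $shared += $cx < $cy ? $cx : $cy;   # minimum of the two counts
        $total  += $cx > $cy ? $cx : $cy;   # maximum of the two counts
    }
    # Two strings without any trigrams are treated as identical.
    return $total ? $shared / $total : 1;
}

# Hypothetical helper: decide whether two page bodies count as "the same page".
# The default threshold of 0.9 is an assumption, not a recommended value.
sub same_page {
    my ($old, $new, $threshold) = @_;
    $threshold //= 0.9;
    return similarity($old, $new) >= $threshold;
}
```

In a spider you could call same_page($body_seen_before, $body_just_fetched) before queueing the links of a freshly fetched URL, and skip the page if it matches one you already have. Comparing against every stored page gets expensive on a large crawl, so you would probably only compare against pages from the same host.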

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web