in reply to Simple link extraction tool
How am I supposed to use your program?
This clobbers any existing listurls.txt, gives me two copies of the data, and puts a useless status message in preferedname.txt:
linkextractor http://www.blah.com/ > preferedname.txt
This clobbers any existing listurls.txt and puts a useless status message in preferedname.txt:
linkextractor http://www.blah.com/ > preferedname.txt & del listurls.txt
This clobbers any existing listurls.txt and loses any error status message:
linkextractor http://www.example.com/ > nul & move listurls.txt preferedname.txt
Suggestions, applied in the rewrite below: send the output to STDOUT, drop duplicate links, and sort the results.
    use strict;
    use warnings;

    use List::MoreUtils qw( uniq );
    use WWW::Mechanize  qw( );

    # usage: linkextractor http://www.blah.com/ > listurls.txt

    my ($url) = @ARGV;

    my $mech = WWW::Mechanize->new();
    my $response = $mech->get($url);
    $response->is_success()
        or die($response->status_line() . "\n");

    print
        map  { "$_\n" }
        sort { $a cmp $b }
        uniq
        map  { $_->url_abs() }
        $mech->links();
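The dedup-and-sort pipeline at the heart of the rewrite can be exercised without a network fetch. A minimal sketch using only core Perl (a `%seen` hash stands in for List::MoreUtils::uniq, and the URL list is made up for illustration):

    use strict;
    use warnings;

    # Hypothetical link list, in the shape url_abs() would produce,
    # with a deliberate duplicate.
    my @urls = (
        'http://www.example.com/b',
        'http://www.example.com/a',
        'http://www.example.com/b',
    );

    # Same pipeline shape as the script: uniq, then sort, then print.
    my %seen;
    print map  { "$_\n" }
          sort { $a cmp $b }
          grep { !$seen{$_}++ }
          @urls;

This prints each URL once, in sorted order, one per line, so the output can be redirected or piped like any other filter.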
Update: At first, I didn't realize it was outputting to STDOUT in addition to listurls.txt. I recommended that the output be sent to STDOUT; the code above is a rewrite along those lines.
Re^2: Simple link extraction tool
by Scott7477 (Chaplain) on Jan 02, 2007 at 23:38 UTC
by jdporter (Paladin) on Jan 03, 2007 at 05:13 UTC