using the web scraping proxy

coder57 has asked for the wisdom of the Perl Monks concerning the following question:

Can someone clarify exactly how the web scraping proxy is used?

I am looking at its readme file and it says

Usage:

wsp.pl [-v] [-a] [-p proxy]

and suggests the proxy does not contain http://

I assume the web scraping proxy saves the content it is scraping to a file. Does it do this automatically, or do I have to specify the file it should save to? I have tried several combinations; each time it says "bad command or filename" and then proceeds to what I assume is the web scraping proxy running (it prints "listening on port 5364... open for connections", yet nothing on the screen shows that the proxy is actually doing anything). I have also tried

perl wsp.pl -v > filetosaveto -p proxy proxyport

only to be met with the same error, sometimes ignoring the port I specified and defaulting to 5364. wsp.pl is in a subdirectory wspv2 in c:/perl/bin. I am a little stumped on where to go next; as far as I am aware, all the required modules from the readme file are installed.

Edited by planetscape - added rudimentary formatting



Replies are listed 'Best First'.
Re: using the web scraping proxy by andyford (Curate) on Nov 02, 2006 at 20:08 UTC
I googled for wsp.pl and found a readme that doesn't mention a "-v" flag. Have you tried it without the "-v"? Next, you should use "-p" only if you normally need another proxy to get to the websites you want to visit. One way to simplify your problem might be to ignore the "-p proxy" bit for now and just try to go to internal websites that don't require a proxy. That should make it easier to learn to use wsp.pl. Just out of curiosity, does this wsp.pl belong to a package or Perl module or something?

This looks real bad: `wsp.pl -v > filetosaveto -p proxy proxyport`

wsp.pl does need redirection to do what you want, but it should be more like this: `wsp.pl -v -p xyx 8080 > filetosaveto`

Update: Added redirection hint.

andyford or non-Perl: Andy Ford
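As a side note on where the redirection goes: in a POSIX shell, a `>` redirection may appear anywhere in a simple command, and the words after it are still passed as arguments. Putting it at the end is easier to read, but it should not change what the program receives. The sketch below illustrates this with `show_args`, a hypothetical stand-in for wsp.pl that just prints its arguments; the proxy value `proxy.example:8080` is a placeholder, not a format confirmed by the readme, and the sketch covers POSIX shells only, not the Windows command prompt.

```shell
# Work in a throwaway directory so nothing real is overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# show_args is a hypothetical stand-in for wsp.pl:
# it prints each argument it receives on its own line.
show_args() { printf '%s\n' "$@"; }

# Redirection in the middle of the command line.
show_args -v > middle.txt -p proxy.example:8080

# Redirection at the end of the command line.
show_args -v -p proxy.example:8080 > end.txt

# Compare what each invocation actually received.
if cmp -s middle.txt end.txt; then
    echo "same arguments either way"
else
    echo "arguments differ"
fi
```

If the two files match, the position of the redirection is unlikely to be the cause of an error message, and the problem is more likely in how the script is found or invoked.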
Re^2: using the web scraping proxy by coder57 (Novice) on Nov 05, 2006 at 22:01 UTC
I did try it as: wsp.pl -v -p proxy proxyport > filetosaveto and wsp.pl -v -p proxy:proxyport > filetosaveto; each time it simply said bad filename (of course I did precede all those commands with perl ...). I used Firefox as the proxy.
Re^3: using the web scraping proxy by andyford (Curate) on Nov 06, 2006 at 10:59 UTC
Did wsp.pl work without the ">" redirection? Not sure what you mean by "use firefox as the proxy". Can you elaborate? Do you mean that you set up firefox to use the computer running wsp.pl as the proxy? The "bad filename" problem is probably a file permission issue or something similar. Are you in a directory where you can create files? Try "ls > filetosaveto" to verify that.

andyford or non-Perl: Andy Ford
Re^4: using the web scraping proxy by coder57 (Novice) on Nov 06, 2006 at 23:27 UTC
Re^5: using the web scraping proxy by andyford (Curate) on Nov 07, 2006 at 10:40 UTC
Some notes below your chosen depth have not been shown here