coder57 has asked for the wisdom of the Perl Monks concerning the following question:

Can someone clarify exactly how the Web Scraping Proxy is used?

I am looking at its readme file, and it says:

Usage:

wsp.pl [-v] [-a] [-p proxy]

and suggests that the proxy value should not contain http://

I assume the Web Scraping Proxy saves the content it is scraping to a file. Does it do this automatically, or do I have to specify the file it should be saved to? I have tried several combinations. Each time it says "bad command or filename" and then proceeds to what I assume is the Web Scraping Proxy running: it shows either "listening on port 5364..." or "open for connections", yet nothing on the screen indicates that it is actually doing anything. I have also tried

perl wsp.pl -v > filetosaveto -p proxy proxyport

only to be met with the same error, and sometimes it ignores the port I specified and defaults to 5364. wsp.pl is in a subdirectory wspv2 in c:/perl/bin. I am a little stumped on where to go next; as far as I am aware, all the required modules from the readme file are installed.
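
A quick way to confirm that the proxy really is up, rather than relying on the screen output, is to try connecting to its port. The following is only a sketch: it assumes wsp.pl is running on the same machine and listening on port 5364 (the default mentioned above), and the helper name port_is_open is made up for illustration. It shows that something accepts TCP connections on that port, not that the listener is wsp.pl.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: returns 1 if something accepts TCP connections
# on $host:$port, 0 otherwise.
sub port_is_open {
    my ($host, $port) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => 2,
    );
    return 0 unless $sock;
    close $sock;
    return 1;
}

# Assumption: the proxy runs on this machine on the default port 5364.
my $is_up = port_is_open('127.0.0.1', 5364);
my $msg   = $is_up
    ? "Something is listening on port 5364\n"
    : "Nothing is listening on port 5364\n";
print $msg;
```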

Edited by planetscape - added rudimentary formatting

Re: using the web scraping proxy
by andyford (Curate) on Nov 02, 2006 at 20:08 UTC
    I googled for wsp.pl and found a readme that doesn't mention a "-v" flag. Have you tried it without the "-v"?

    Next, you should use the "-p" only if you normally need another proxy to get to the websites you want to visit.
    One way to simplify your problem might be to ignore the "-p proxy" bit for now and just try to go to internal websites that don't require a proxy. That should make it easier to learn to use wsp.pl.
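
To make the "-p" advice concrete, here is a rough Perl sketch that builds the wsp.pl argument list and adds "-p" only when an upstream proxy is actually given. It relies on the readme's remark that the value should not contain http://. The host:port form of the value and the helper name build_wsp_args are assumptions for illustration; the readme quoted in this thread does not confirm them.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for wsp.pl.
# $upstream is optional; pass nothing when no upstream proxy is needed.
sub build_wsp_args {
    my ($upstream) = @_;
    my @args = ('wsp.pl');
    if (defined $upstream && length $upstream) {
        # The readme says the proxy should not contain http://, so strip it.
        (my $clean = $upstream) =~ s{^https?://}{}i;
        $clean =~ s{/+$}{};    # drop any trailing slash as well
        push @args, '-p', $clean;
    }
    return @args;
}

# No upstream proxy: the simplest case suggested above.
print join(' ', build_wsp_args()), "\n";

# With an upstream proxy (proxy.example.com:8080 is a placeholder).
print join(' ', build_wsp_args('http://proxy.example.com:8080/')), "\n";
```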

    Just out of curiosity, does this wsp.pl belong to a package or Perl module or something?

    This looks real bad:

    wsp.pl -v > filetosaveto -p proxy proxyport

    wsp.pl does need redirection to do what you want, but it should be more like this:

    wsp.pl -v -p xyx 8080 > filetosaveto
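
The same effect as the ">" redirection can also be had from a small Perl wrapper, which avoids any doubt about where the redirection belongs on the command line. This is only a sketch: it assumes wsp.pl writes the material worth keeping to standard output (which the redirection advice implies), and the helper name run_and_save is hypothetical. A stand-in command is used so the sketch runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and write its standard output to $outfile.
# Returns the command's exit status (0 means success).
sub run_and_save {
    my ($outfile, @cmd) = @_;
    open(my $out, '>', $outfile) or die "Cannot write $outfile: $!";
    # List-form pipe open: no shell is involved, so no quoting surprises.
    open(my $in, '-|', @cmd) or die "Cannot run $cmd[0]: $!";
    print {$out} $_ while <$in>;
    close $in;                     # sets $? to the child's wait status
    my $status = $? >> 8;
    close $out or die "Cannot close $outfile: $!";
    return $status;
}

# Stand-in command so the sketch is self-contained; for the real thing the
# command list would start with something like ($^X, 'wsp.pl').
my $status = run_and_save('filetosaveto', $^X, '-e', 'print "captured\n"');
print "exit status: $status\n";
```

Note that the list form of a pipe open needs a reasonably recent Perl on Windows (5.22 or later), and that a long-running program such as a proxy will keep the wrapper running until it is stopped.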

    Update: Added redirection hint.

    andyford
    or non-Perl: Andy Ford

      I did try it as: wsp.pl -v -p proxy proxyport > filetosaveto and as: wsp.pl -v -p proxy:proxyport > filetosaveto. Each time it simply said "bad filename" (of course I did precede all those commands with perl ...). I used Firefox as the proxy.

        Did wsp.pl work without the ">" redirection?
        Not sure what you mean by "use firefox as the proxy". Can you elaborate?
        Do you mean that you set up firefox to use the computer running wsp.pl as the proxy?
        The "bad filename" problem is probably a file permission issue or something similar. Are you in a directory where you can create files? Try "ls > filetosaveto" to verify that.
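
If ls is not available (the c:/perl/bin path suggests Windows), the same check can be done from Perl. The sketch below tries to create a file in the current directory and reports the operating system's error message if that fails. The helper name can_create_file and the probe file name are hypothetical.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Hypothetical helper: try to create the file $name.
# Returns an empty string on success, or the OS error message on failure.
sub can_create_file {
    my ($name) = @_;
    open(my $fh, '>', $name) or return "$!";
    close $fh;
    unlink $name;    # clean up the probe file
    return '';
}

my $err = can_create_file('wsp_write_test.tmp');
my $dir = getcwd();
my $msg = $err eq ''
    ? "OK: files can be created in $dir\n"
    : "Cannot create files in $dir: $err\n";
print $msg;
```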

        andyford
        or non-Perl: Andy Ford