PerlMonks  

remote directory structure

by poprishchin (Sexton)
on Mar 05, 2002 at 23:44 UTC ( [id://149556]=perlquestion )

poprishchin has asked for the wisdom of the Perl Monks concerning the following question:

I need to get a remote server's directory structure for a script I am working on. Is this possible? The user would supply a URL (http://www.w3.org, for instance), and I need the directory structure under it returned (index.html, docs/, html/, images/, whatever...). Can this be done without having a login for the server? Thanks!

Replies are listed 'Best First'.
Re: remote directory structure
by theorbtwo (Prior) on Mar 06, 2002 at 08:09 UTC

    Sure. Get the HTML with LWP, read the hrefs and srcs out of it with HTML::Parser (or friends), and then chop off the part after the last /. Lather, recurse, repeat. This, of course, wont catch directories that exist but have nothing in them linked from anywhere, nor directories that are only ever linked by their bare name without the trailing slash (the implicit redirect to their index.html), since chopping after the last / then yields the parent.
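One step of the approach above could be sketched roughly as follows. This is only an illustration, not theorbtwo's code: the function name `directories_in` is made up for the example, it assumes the URI and HTML::LinkExtor modules are installed (HTML::LinkExtor ships with the HTML::Parser distribution), and it works on HTML you have already fetched (for instance with `get` from LWP::Simple) rather than fetching anything itself.

```perl
use strict;
use warnings;
use HTML::LinkExtor;
use URI;

# Hypothetical helper: given a page's HTML and the URL it came from,
# return the sorted directory paths implied by the links on that page
# that point at the same host.
sub directories_in {
    my ($html, $base) = @_;
    my $host = lc URI->new($base)->host;
    my %dirs;

    # With no callback, the parser collects links for ->links;
    # passing $base makes every link an absolute URI object.
    my $parser = HTML::LinkExtor->new(undef, $base);
    $parser->parse($html);
    $parser->eof;

    for my $link ($parser->links) {
        my ($tag, %attrs) = @$link;    # e.g. ('a', href => URI)
        for my $url (values %attrs) {
            # Skip mailto:, javascript:, and links to other hosts.
            next unless ($url->scheme // '') =~ /^https?$/;
            next unless lc $url->host eq $host;

            # Chop off everything after the last slash.
            (my $dir = $url->path) =~ s{[^/]*$}{};
            $dir = '/' if $dir eq '';
            $dirs{$dir} = 1;
        }
    }
    return sort keys %dirs;
}
```

To recurse, you would fetch each same-host page that turns up and feed it back through the same function, keeping a hash of URLs already visited. As noted above, a link to a bare directory name such as /docs is credited to its parent, so the result is a lower bound on what actually exists.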

    If you want something thats a little less of an ugly hack, hope you can use FTP to get the content.


    We are using here a powerful strategy of synthesis: wishful thinking. -- The Wizard Book

    PS -- sorry about the bad contractions, but the quote key on this keyboard is broken.

(redmist) Re: remote directory structure
by redmist (Deacon) on Mar 06, 2002 at 00:11 UTC

    As long as you are talking about the directory structure underneath the DocumentRoot, I believe it depends on the directory permissions on the server (if they are set a certain way, you can request the directory -- e.g. http://foo.com/images). Look into wget, LWP::(Simple|RobotUA) (haven't used either for what you want to do, and haven't looked into it much).

    redmist
    Purple Monkey Dishwasher
Re: remote directory structure
by perrin (Chancellor) on Mar 06, 2002 at 01:01 UTC
    No, this is not possible without a login on that server, or some other form of cooperation from the server.
Re: remote directory structure
by abaxaba (Hermit) on Mar 06, 2002 at 04:11 UTC
    Couple of different options: Expect.pm, with a raw telnet into the box. Not real secure for the remote user, IMHO.

    OR

    Write a CGI for the remote box that basically does this:
    print "Content-type: text/plain\n\n";
    opendir(my $dh, ".") or die "Cannot open directory: $!";
    (my $self = $0) =~ s{.*/}{};    # this script's own file name
    # List everything except ".", "..", and the script itself, one name per line.
    print join("\n", sort grep { !/^\.\.?$/ && $_ ne $self } readdir $dh);
    closedir($dh);
    Put the CGI on the remote machine, and just parse the HTTP response into an array to get the directory listing. If this needs to be recursive on the remote machine, check out File::Find.
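For the recursive case, a minimal sketch with the core File::Find module might look like this. The name `list_directories` is hypothetical, and the choice to return directories only, as paths relative to the starting point with a trailing slash, is an assumption made for the example; a CGI like the one above could print the result one per line.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every directory below $root as a sorted
# list of paths relative to $root, each with a trailing slash.
sub list_directories {
    my ($root) = @_;
    my @dirs;
    find(sub {
        # find() chdirs into each directory, so $_ is the bare name
        # and $File::Find::name is the full path from $root.
        return unless -d $_;
        return if $File::Find::name eq $root;    # skip the root itself
        (my $rel = $File::Find::name) =~ s{^\Q$root\E/?}{};
        push @dirs, "$rel/";
    }, $root);
    return sort @dirs;
}
```

Called as `list_directories(".")` from inside the CGI, this would walk everything under the directory the script runs in, which is worth restricting before exposing it on a public server.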
Re: remote directory structure
by webadept (Pilgrim) on Mar 06, 2002 at 07:31 UTC
    Ha! :-) ummm, nope, I don't believe this would be possible without logging in... I suppose it could be made to be possible, but they might do nasty things to the sysadmin in charge who let something like that be possible. Pummel into unconsciousness with an organic carrot comes to mind.

    webadept
