jira0004 has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

Okay, here is the situation:

I am a developer and I can post data files on a production server through a staging process. I don't have FTP access to the directory where the files are posted, so I can't just log on to the production server via FTP and run ls. However, I have posted a number of data files in that directory and in sub-directories under it, and I no longer know which files I've posted.

I want to write a Perl script, maybe using the LWP module, that takes the directory part of a URL and generates a list of all of the files available under that directory.

This program would work as follows:

Command:

prompt> perl http_ls.pl http://www.moonpie.com/

Output:

The following files exist under the URL http://www.moonpie.com/ :

frodo.html
README.txt
gandalf.html
config_info.txt
bilbo.html
scream.wav
my_photo.jpg

The Perl script would use something like LWP to get a listing of all of the files available under the base URL, http://www.moonpie.com/ in the above example.

Note that http://www.moonpie.com/ is not my URL so do not visit it as I don't know what it is, I just used it as an example.

I don't need someone to write the code for me (although I suppose you could if you wanted to). I do need someone to let me know whether this is possible and, if so, which modules/protocols I should use to get a listing of files via HTTP.

I have checked through the LWP documentation but it doesn't seem that LWP supports an ls-type operation. Maybe this just isn't supported via HTTP (in which case I am just out of luck). Does anyone know if the operation that I am trying to do is supported via HTTP? And if so, how do I do it?

Any pointers, advice or insight anyone has would be greatly appreciated.

Thanks, Regards,

Peter Jirak

jira0004@yahoo.com

  • Comment on Is there a method in LWP that lets me get a listing of all files available in a URL directory?

Replies are listed 'Best First'.
Re: Is there a method in LWP that lets me get a listing of all files available in a URL directory?
by jdtoronto (Prior) on Jul 05, 2006 at 19:03 UTC
    If the server will allow directory listings, yes. If not, no. It has nothing to do with LWP at all. Something like LWP::UserAgent makes the request and the server returns whatever it will.

    jdtoronto
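A minimal sketch of that point, assuming LWP is installed: request the directory URL and look at what comes back. The helper name describe_response and the "Index of" title check are assumptions made for illustration. The title check is only a heuristic that matches Apache-style auto-generated listings, and other servers format theirs differently.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Classify what the server returned for a directory URL.
# describe_response is a hypothetical helper, not part of LWP.
sub describe_response {
    my ($response) = @_;
    return 'request failed: ' . $response->status_line
        unless $response->is_success;

    my $body = $response->decoded_content // '';

    # Heuristic (an assumption): Apache-style listings are titled "Index of /path".
    return $body =~ m{<title>\s*Index of\b}i
        ? 'looks like an auto-generated directory listing'
        : 'some other page (probably a default index page)';
}

# Usage: perl http_ls.pl http://example.org/some/dir/
if (@ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 10);
    print describe_response($ua->get($ARGV[0])), "\n";
}
```

If the server is configured not to list directories, the request typically fails (for example with 403 Forbidden) or returns the default page, and there is nothing LWP can do about it.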

Re: Is there a method in LWP that lets me get a listing of all files available in a URL directory?
by gellyfish (Monsignor) on Jul 05, 2006 at 19:04 UTC

    Yes, you are correct in your guess that HTTP has no 'ls'-like operation. A web server may give you a listing of the files in a 'directory' in response to a GET request if it is configured to allow this and there is no default page configured.

    /J\

Re: Is there a method in LWP that lets me get a listing of all files available in a URL directory?
by sgifford (Prior) on Jul 05, 2006 at 19:06 UTC
    HTTP doesn't support this on its own. If there are links to all of the files, you could use "spidering" techniques to explore all of the links; automatically-generated index pages may make this easier. Another option is WebDAV, designed for authoring Web pages, which probably has a way to list files. A third option is writing a small script to run on the server and provide the file listings for you. If the list of files should be secret at all, you'd want to make sure this was written securely and used some kind of authentication.
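The third option (a small script on the server) might look roughly like the sketch below. It uses only core modules. The function name list_files and the DATA_ROOT environment variable are hypothetical, and the sketch deliberately leaves out the authentication mentioned above, which would have to be added before exposing it.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;

# Return the paths of all plain files under $root, relative to $root, sorted.
# list_files is a hypothetical helper name.
sub list_files {
    my ($root) = @_;
    my @files;
    find({
        no_chdir => 1,    # keep the working directory; $File::Find::name is the full path
        wanted   => sub {
            push @files, File::Spec->abs2rel($File::Find::name, $root)
                if -f $File::Find::name;
        },
    }, $root);
    my @sorted = sort @files;
    return @sorted;
}

# When run as a CGI script, print the listing as plain text.
# DATA_ROOT is an assumed environment variable naming the directory to list.
if ($ENV{GATEWAY_INTERFACE}) {
    my $root = $ENV{DATA_ROOT} // '.';
    print "Content-Type: text/plain\r\n\r\n";
    print "$_\n" for list_files($root);
}
```

A client could then fetch this script's URL with LWP and split the response on newlines, with no HTML parsing needed.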
      I think that sgifford's WebDAV suggestion is right on. So, check out HTTP::DAV and Net::DAV::Server. I would also recommend a great little program called cadaver. It's a command-line WebDAV client that supports file upload, download, on-screen display, namespace operations, collection creation and deletion, and locking operations. Better yet, there's always neon, an HTTP and WebDAV client library with a C interface.
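For reference, WebDAV lists a collection's members with a PROPFIND request carrying a "Depth: 1" header, and the server answers with a 207 Multi-Status XML body. The sketch below assumes the server has WebDAV enabled for the directory. The helper names are hypothetical, and the regex-based extraction is a rough shortcut. HTTP::DAV or a real XML parser would be the more robust choice.

```perl
use strict;
use warnings;
use HTTP::Request;

# Build a WebDAV PROPFIND request. "Depth: 1" asks for the collection
# itself plus its immediate members. propfind_request is a hypothetical helper.
sub propfind_request {
    my ($url) = @_;
    return HTTP::Request->new(PROPFIND => $url, [ Depth => 1 ]);
}

# Pull the <href> values out of a Multi-Status body, whatever namespace
# prefix the server uses. A regex shortcut, not a full XML parse.
sub multistatus_hrefs {
    my ($xml) = @_;
    my @hrefs = $xml =~ m{<(?:[\w.-]+:)?href\s*>\s*([^<]+?)\s*</(?:[\w.-]+:)?href\s*>}g;
    return @hrefs;
}

# Possible usage with LWP (needs a WebDAV-enabled server):
#   use LWP::UserAgent;
#   my $res = LWP::UserAgent->new->request(propfind_request($url));
#   print "$_\n" for multistatus_hrefs($res->decoded_content) if $res->code == 207;
```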
Re: Is there a method in LWP that lets me get a listing of all files available in a URL directory?
by saberworks (Curate) on Jul 05, 2006 at 19:06 UTC
    Even if the web server will give you a list of files, different servers display the files in different ways, and in fact, it's just returning an HTML page. So, even if these options were enabled on the servers you were interested in, you'd have to parse the HTML page to pull out the filenames. I also know for a fact that Apache cuts off the file names if they are too long. You're better off logging in via FTP or providing a URL on the server which returns the list of files in a manner you can parse.
      I also know for a fact that apache cuts off the file names if they are too long.
      This shouldn't be much of a problem, because only the displayed part of the link is cut, not the href.

        You are right that only the displayed name is truncated, not the link, but if you want to avoid the truncation that is easy enough to do too.

        Steve
        --
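To illustrate the point about hrefs, here is a small sketch that pulls file names out of a listing page using HTML::LinkExtor (from the HTML-Parser distribution). It reads the href attributes rather than the displayed link text, so names truncated for display still come through in full. The helper name listing_entries and the filtering rules (skipping column-sort links, absolute paths, parent links and full URLs) are assumptions based on Apache-style listings.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Return the href targets of the <a> tags in a directory-listing page.
# listing_entries is a hypothetical helper name.
sub listing_entries {
    my ($html) = @_;
    my $parser = HTML::LinkExtor->new;    # no callback: links are collected
    $parser->parse($html);
    $parser->eof;

    my @entries;
    for my $link ($parser->links) {
        my ($tag, %attr) = @$link;
        next unless $tag eq 'a' && defined $attr{href};
        my $href = "$attr{href}";
        # Skip sort links ("?C=N;O=D"), absolute paths, "../" and full URLs.
        next if $href =~ m{^(?:\?|/|\.\./|[a-z][a-z0-9+.-]*:)}i;
        push @entries, $href;
    }
    return @entries;
}
```

Entries ending in "/" are sub-directories. Fetching each of those in turn and repeating the extraction is the "spidering" approach mentioned earlier in the thread.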
Re: Is there a method in LWP that lets me get a listing of all files available in a URL directory?
by tinita (Parson) on Jul 06, 2006 at 08:12 UTC
    Note that http://www.moonpie.com/ is not my URL so do not visit it as I don't know what it is, I just used it as an example.
    Have you ever heard of example.org (and RFC 2606)?