przemek88 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I searched for an answer but didn't find one. I have a Perl script that recursively mirrors the pages of a website, subject to user conditions (depth, file size, extension, path, etc.). I need this script to work server-side, with an HTML form where the user can just fill in the appropriate fields and submit them. It's easy to run such a script on localhost with Apache, because I can just type a path and it saves the website on my local computer (server and client are the same machine).

But there is a problem. If I uploaded this script and form to a server, it wouldn't work, because my specified local path would be treated as a server path, so all the files would be stored on the server. I need a solution that makes the script save web pages to the user's local disk: either save the files directly, or store them on my server and then save them to the user's local disk (a bad solution).

Thank you.

Replies are listed 'Best First'.
Re: cgi saving files on client local disk
by jethro (Monsignor) on Jan 25, 2011 at 23:25 UTC

    Basically you have to provide a link on the search form that returns your web pages (probably as a tar or zip file). Google for "MIME" and look for an appropriate response type; for example, "application/zip" for a zip file should work. (See http://de.selfhtml.org/diverses/mimetypen.htm for a nice list; it's in German but easily comprehensible.)

    There can't be a static file behind that link, because the web server still has to download those pages, which takes time; a static link would just return "file not found". Instead you need a CGI script behind that link that delivers the result of the mirroring and zipping as soon as it is available, with the appropriate MIME type.
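    A minimal sketch of such a delivery script, assuming the mirroring-and-zipping step has already produced the archive (the path and filename here are made up):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the HTTP headers that tell the browser a zip archive follows
# and that it should be saved rather than displayed inline.
sub zip_headers {
    my ($filename) = @_;
    return "Content-Type: application/zip\r\n"
         . "Content-Disposition: attachment; filename=\"$filename\"\r\n"
         . "\r\n";
}

# Hypothetical archive produced by the mirror-and-zip step.
my $archive = '/tmp/mirror.zip';

print zip_headers('mirror.zip');

# Stream the archive in binary mode (guarded so the sketch runs
# even when the archive does not exist yet).
if (-e $archive) {
    open my $fh, '<:raw', $archive or die "Cannot open $archive: $!";
    binmode STDOUT;
    print while <$fh>;
    close $fh;
}
```

    The browser sees application/zip plus an attachment disposition and offers to save the archive instead of trying to render it.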

    I don't think you can deliver the web pages separately without a lot of complex scripting, and you also can't avoid the user being asked where he wants to save the file, unless he has configured his browser to save files of this type to disk without asking. That is outside your influence.

Re: cgi saving files on client local disk
by ikegami (Patriarch) on Jan 25, 2011 at 23:05 UTC

    Neither the server nor your CGI application has access to the client except in the ability to return an HTTP response to it. Furthermore, an HTTP response can only contain one document. (I'm ignoring multi-part documents and archives since the browser would still take those to be one document.) For example, if an HTML page is returned, nothing else can be returned.

    Let's look at the simple case first: There is no need to return an HTML page. Then you can just return the file you want to save as the body of the response, and tell the browser that it should be saved instead of displayed by using the appropriate Content-Disposition header. The user will be prompted where to save the document. This is inescapable because web sites must not have arbitrary access to your hard drive for obvious security reasons.
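    In Perl CGI terms, the simple case boils down to printing the right headers ahead of the body. A sketch (the filename and body are illustrative):

```perl
use strict;
use warnings;

# Assemble a complete HTTP response that makes the browser prompt
# the user to save the body instead of rendering it. The filename
# and body passed in below are illustrative.
sub download_response {
    my ($filename, $body) = @_;
    return "Content-Type: application/octet-stream\r\n"
         . "Content-Disposition: attachment; filename=\"$filename\"\r\n"
         . 'Content-Length: ' . length($body) . "\r\n"
         . "\r\n"
         . $body;
}

print download_response('index.html', "<html><body>mirrored page</body></html>\n");
```

    The Content-Disposition: attachment header is what triggers the browser's "save as" prompt.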

    The next case in ascending complexity is one HTML page to display and one document to save. You can do that by returning the HTML and including inside it an HTTP redirect back to your script (presumably with extra parameters). This second call to your script is just the simple case above. You can look at the response you get from the download page of your favorite application for an example.
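    One way to sketch that pattern: return an HTML page whose meta refresh points back at the same script with a flag telling it to serve the file. The script URL and the download=1 parameter name here are made up:

```perl
use strict;
use warnings;

# Return an HTML page that shows a message and, after a short delay,
# redirects back to the same CGI script with download=1 so the second
# request hits the simple "just send the file" case.
sub download_page {
    my ($script_url) = @_;    # hypothetical, e.g. '/cgi-bin/mirror.pl'
    return <<"HTML";
<html>
<head>
<meta http-equiv="refresh" content="2; url=$script_url?download=1">
</head>
<body>
<p>Your mirror is ready. The download should start shortly.
If it does not, <a href="$script_url?download=1">click here</a>.</p>
</body>
</html>
HTML
}

print "Content-Type: text/html\r\n\r\n";
print download_page('/cgi-bin/mirror.pl');
```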

    The most complex case is when there are multiple documents to save in addition to the HTML page. You'd have to use frames or JavaScript to initiate multiple calls back to your CGI script (again, with different parameters to indicate which file to save). This won't be very usable, since the user will have to specify the location of each file. You'd be much better off returning an archive or an installer.
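    If you go the archive route, the bundling step can be sketched with the core Archive::Tar module (the file names and contents here are invented; a real script would add the files the mirror produced):

```perl
use strict;
use warnings;
use Archive::Tar;    # in the Perl core since 5.9.3

# Bundle several mirrored pages into a single archive so that one
# response (and one save prompt) delivers everything at once.
my $tar = Archive::Tar->new;
$tar->add_data('mirror/index.html', "<html>front page</html>\n");
$tar->add_data('mirror/about.html', "<html>about page</html>\n");

# Serve the result with Content-Type: application/x-tar (or gzip it
# and use application/gzip) plus a Content-Disposition header.
$tar->write('mirror.tar');
```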