Avox has asked for the wisdom of the Perl Monks concerning the following question:

I've been working on uploading files via Webmin. I've created a little module to upload a binary file using CGI.pm's upload function. It works for small files, of course, but as files get larger it takes too long, and the browser gives no indication of progress (not that I expected it to, but that's another question).

I have a couple of questions and am seeking advice.

Advice1: I basically need to be able to upload a 100 MB file. I truthfully didn't expect this solution to work as is, but I had to start somewhere, so I started with CGI.pm's upload. Does anyone have a better idea of a way to do this? Really, the only requirement I have is that I implement it via Webmin (for those who don't know what this is, it's a web-based server administration tool written entirely in Perl - www.webmin.com - check it out, it's great!).

Q1: Does the upload function of CGI.pm read the entire file into memory before continuing with the script? If so, is there a way to stream it to a file on disk rather than slurping it all up first?

Q2: Does anyone know of any good examples of the CGI::upload_hook callback? I'm assuming it would be best to pop up a window using this callback and have the user refresh it every minute or so.

Thanks, everyone. While I'm not an avid poster on PerlMonks, I certainly read and enjoy the posts and the feeling of community!

Replies are listed 'Best First'.
Re: CGI.pm and Large File Transfers
by iburrell (Chaplain) on Mar 15, 2004 at 22:08 UTC
    CGI.pm always saves uploads to a temporary file and gives the program a filehandle to that file. It does have a size limit, defined in $POST_MAX; you may need to set that to -1 (unlimited) or a large positive size. Also check the $DISABLE_UPLOADS variable, which turns uploads off entirely when set.
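To make the point above concrete, here is a minimal sketch of streaming CGI.pm's temp-file handle to a permanent location in fixed-size chunks, so nothing is slurped into memory. The field name 'upfile' and the target path are assumptions for illustration, not anything from the thread.

```perl
use strict;
use warnings;

# Copy a filehandle to a file on disk in 64 KB chunks.
sub stream_to_disk {
    my ($in, $path) = @_;
    open my $out, '>', $path or die "open $path: $!";
    binmode $_ for $in, $out;
    my $buf;
    while (read $in, $buf, 64 * 1024) {
        print {$out} $buf;
    }
    close $out or die "close $path: $!";
}

# In a real CGI it would be wired up something like this (field name
# 'upfile' and the target path are made up for this sketch):
#
#   use CGI;
#   $CGI::POST_MAX = 200 * 1024 * 1024;   # allow up to 200 MB; -1 = unlimited
#   my $q  = CGI->new;
#   my $fh = $q->upload('upfile') or die $q->cgi_error;
#   stream_to_disk($fh, '/tmp/received.bin');
```

Since CGI.pm has already spooled the upload to its own temp file by the time upload() returns, this copy is disk-to-disk and never holds more than one chunk in memory.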

    Where did you find the upload_hook callback? I have never heard of it and can't find it in CGI.pm.

      The more I think about this challenge, the less I think an HTTP solution will work reliably (HTTP certainly wasn't designed for it).

      As for upload_hook, I found it in the CGI.pm page on CPAN.org:

      You can set up a callback that will be called whenever a file upload is being read during form processing. This is much like the UPLOAD_HOOK facility available in Apache::Request, with the exception that the first argument to the callback there is an Apache::Upload object, while here it's the remote filename.
      $q = CGI->new();
      $q->upload_hook(\&hook, $data);

      sub hook {
          my ($filename, $buffer, $bytes_read, $data) = @_;
          print "Read $bytes_read bytes of $filename\n";
      }

      If using the function-oriented interface, call the CGI::upload_hook() method before calling param() or any other CGI functions:
      CGI::upload_hook(\&hook,$data);

      This method is not exported by default. You will have to import it explicitly if you wish to use it without the CGI:: prefix.
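Building on the documentation quoted above, a hypothetical way to use the hook for the popup-window idea from Q2 is to have it write progress to a status file that a separate, auto-refreshing page can read. The status-file path and the hash layout here are assumptions, not part of CGI.pm.

```perl
use strict;
use warnings;

# Hook called by CGI.pm for each buffer read from the upload stream.
# Overwrites a small status file with the running byte count, so a
# polling popup page can display it. (Status-file location is made up.)
sub progress_hook {
    my ($filename, $buffer, $bytes_read, $data) = @_;
    open my $st, '>', $data->{status_file} or return;
    print {$st} "$bytes_read bytes of $filename\n";
    close $st;
}

# In a real CGI, install the hook before calling param() or upload():
#
#   CGI::upload_hook(\&progress_hook,
#                    { status_file => '/tmp/upload-status.txt' });
```

The popup would then be a trivial second CGI that prints the status file's contents with a meta-refresh header.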
        HTTP is just as reliable and fast as FTP - more reliable, even, since there is only one connection instead of FTP's separate data connection, so fewer firewall worries. And you don't have to worry about a separate server with different authentication.

        Also, if you don't need other form fields (including the filename), you can do a POST with the file content as the body. The receiving CGI would not use CGI.pm to parse the upload but would read the body directly and save it to disk. The filename would have to be constructed by some other mechanism.
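A minimal sketch of that raw-body approach might look like the following; the target path is an assumption, and in a real CGI the input handle would be STDIN with the length taken from CONTENT_LENGTH.

```perl
use strict;
use warnings;

# Read exactly $length bytes from $in and write them to $path in chunks,
# returning the number of bytes actually saved.
sub save_post_body {
    my ($in, $length, $path) = @_;
    open my $out, '>', $path or die "open $path: $!";
    binmode $_ for $in, $out;
    my ($buf, $left) = ('', $length);
    while ($left > 0) {
        my $got = read $in, $buf, ($left < 65536 ? $left : 65536);
        last unless $got;          # short read: client went away
        print {$out} $buf;
        $left -= $got;
    }
    close $out or die "close $path: $!";
    return $length - $left;
}

# In a real CGI (path is made up for this sketch):
#
#   save_post_body(\*STDIN, $ENV{CONTENT_LENGTH}, '/tmp/received.bin');
```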

Re: CGI.pm and Large File Transfers
by hardburn (Abbot) on Mar 15, 2004 at 21:52 UTC

    CGI.pm saves it to a temporary file. Note, though, that allowing anyone to upload a file of any size like this is a security problem (CGI.pm's documentation on file transfers goes over this).

    HTTP wasn't designed with file uploading in mind, and I suggest finding a different solution.

    ----
    : () { :|:& };:

    Note: All code is untested, unless otherwise stated

      Thanks for the input. I think you are correct. Do you have any suggestions? I have a few ideas, but it's always good to hear more options!

      Thanks for the response!
Re: CGI.pm and Large File Transfers
by Joost (Canon) on Mar 15, 2004 at 22:43 UTC
    Q1 - As pointed out above, CGI.pm saves uploads to a temporary file, so it can handle any file size as long as you have disk space left. I've handled files of over 600 MB with it without any problems, except for buggy Internet Explorers :-)

    Q2 - I never noticed that callback, but you probably won't be able to do the notification very efficiently in a CGI. What I do in situations like this is show an animated GIF on form submit, indicating that the upload is taking place:

    <form ... onsubmit="document.images['imagename'].src='animation.gif';">
        ....
    </form>
    Or something like it (I can't remember the exact JavaScript right now).

    It's a lot easier and looks almost as good :-)

    Joost.

Re: CGI.pm and Large File Transfers
by tachyon (Chancellor) on Mar 15, 2004 at 23:46 UTC

    Lots of options other than CGI directly: FTP, SFTP, SCP, SSH-protected FTP, rsync, or wget. You could use server pull rather than client push, letting the remote server pull the file from your local box. You could just do it with sockets. There are lots of ways to transfer files.

    Assuming this is a regular upload, you could just put the data in a browsable location locally (assuming you have a web server on site), protect it with a .htaccess file, and set a cron job to fetch it with wget whenever it suits.
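For illustration, a hypothetical crontab entry for the pull side might look like this; the URL, credentials, schedule, and paths are all made up, and wget's auth flag spellings vary a little between versions.

```
# Fetch the dropped file every night at 2:00; the --http-user/--http-password
# pair matches the .htaccess protection on the serving box (all names here
# are assumptions for this sketch).
0 2 * * * wget -q --http-user=backup --http-password=secret -O /data/bigfile.bin http://example.com/drop/bigfile.bin
```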

    cheers

    tachyon

Re: CGI.pm and Large File Transfers
by Anonymous Monk on Mar 16, 2004 at 05:25 UTC

    Uploading a 100 MB file over HTTP is sure to take a long time. Downloading 100 MB is bad enough, but most people have MUCH lower upload bandwidth than download bandwidth. Chances are the web server you're working with times out HTTP connections after a certain period (my version of Apache has a default 300-second limit). If you're using Apache, you can raise this timeout so the upload won't be cancelled, but I can't imagine how many minutes or hours a 100 MB upload would take. I'd definitely look for another approach.
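The Apache setting mentioned above is the Timeout directive; a hypothetical httpd.conf fragment raising it might look like the following (the one-hour value is an assumption - tune it to the expected upload duration).

```
# httpd.conf - raise the connection timeout (value is in seconds)
Timeout 3600
```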

Re: CGI.pm and Large File Transfers
by Avox (Sexton) on Mar 16, 2004 at 19:37 UTC
    Thanks, everyone. I think I'm going to use HTTP to kick off an FTP transaction on the server. It will probably be a bit cleaner that way. Thanks so much for everyone's help!