hacker has asked for the wisdom of the Perl Monks concerning the following question:
One of the phases of the portal system I've been working on is designed to let users upload files (Palm documents, ebooks in Palm format, etc.) so that other users can download and use them. I have the file upload portion of my script working at a very rudimentary level, and it looks like this:
    use strict;
    use Env;
    use CGI qw(:standard);
    use Digest::MD5 qw(md5 md5_hex md5_base64);
    use POSIX qw(strftime);
    use Date::Manip;

    # No trailing colon: an empty PATH entry means the current directory
    $ENV{'PATH'} = '/usr/bin:/bin';

    my $query   = CGI->new;
    my $modtime = scalar(gmtime(time - (3600 * 8))) . " GMT";
    my $exptime = scalar(gmtime) . " GMT";

    ######################################################
    #
    # Form stuff asking for the upload, title, filename,
    # author, and other relevant file-specific data here
    #
    ######################################################
    # ...

    sub print_results {
        my $query = shift;
        my ($length, $filename, $filetype);
        my $directory = "/tmp/palm";

        # Keep the upload filehandle separate from the client-side name;
        # editing the name in place would destroy the handle.
        my $fh        = $query->upload('pl_upload');
        my $file_name = $query->param('pl_upload');

        if (!$fh) {
            print "No file received..\n";
            return;
        }

        # Strip any client-side path, leaving only the base name
        $file_name =~ s/.*[\/\\](.*)/$1/;
        print h3("File Name"), $file_name;

        my $md5file = md5_hex($file_name);
        open(SAVEPDB, '>', "$directory/${file_name}_${md5file}") or die $!;
        binmode SAVEPDB;    # Palm databases are binary
        while (<$fh>) {
            print SAVEPDB $_;
            $length += length($_);
        }
        close SAVEPDB;

        # Print file data here, size, title, etc.
    }
From here, I simply print the results of the file's size, type, title, and other form elements entered for diagnostics. This part works perfectly.
Thanks to help and suggestions from tye, ChemBoy, and ferrency earlier today, I will be using a newer construct based on IPC::Open2, instead of my current system() call, to retrieve the compression type stored inside the binary file itself (DOC or zlib). Note that I'm saving the file under an md5sum'd filename, to avoid collisions with duplicate documents being uploaded, etc.
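One caveat on the md5sum'd filename: a digest of the file name only distinguishes names, so the same document uploaded under two names still gets two digests. A minimal sketch, assuming the aim is to spot duplicate documents, is to digest the file contents instead. The helper name content_digest is hypothetical; addfile and hexdigest are part of Digest::MD5's object interface.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: return the hex MD5 of a file's contents, so that
# identical documents map to the same digest whatever they are called.
sub content_digest {
    my ($path) = @_;

    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode $fh;    # Palm databases are binary

    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;

    return $digest;
}
```

Two uploads with equal digests can then be treated as the same document, with the second one pointing at the file already stored.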
Here's the rub: I need to find a good, workable design that allows thousands of users to upload files and content in this format to the server, where the uploads will sit in an approval queue before being made "live" on the site for others to download.
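As a rough illustration of the approval queue at the filesystem level, here is a sketch that assumes uploads land in a "pending" directory and are moved to a "live" directory once approved. The helper name approve_upload and both directory arguments are hypothetical; a status column in the database would be another way to model the same step.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Spec;

# Hypothetical helper: promote an approved upload from the pending
# directory to the live one. Both directories are assumptions.
sub approve_upload {
    my ($pending_dir, $live_dir, $stored_name) = @_;

    # Refuse anything that is not a bare file name
    die "Bad stored name: $stored_name"
        if $stored_name =~ m{[/\\]} or $stored_name =~ /^\.\.?\z/;

    my $from = File::Spec->catfile($pending_dir, $stored_name);
    my $to   = File::Spec->catfile($live_dir,    $stored_name);

    die "No such pending file: $from" unless -f $from;
    die "Already live: $to"           if -e $to;

    move($from, $to) or die "Cannot move $from to $to: $!";
    return $to;
}
```

Until the approval process is web-based, the same move can be done by hand; the point is only that download pages never read from the pending directory.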
What is the best approach to doing this? Blob them in a MySQL database? Store them on the filesystem? Both? And in doing so, how do I track which filename belongs with which "actual" file, so that when the file(s) are listed on a webpage for download, the title is something human readable, not 'e3206099b8ad73408762ab0ea5e8f1f2'?
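One arrangement that fits the question, sketched here under assumptions, keeps the file bytes on the filesystem under the digest name and keeps the human-readable details in a lookup keyed on that digest. The helper storage_path, the two-character subdirectory fan-out, and the %catalog hash (a stand-in for database rows) are all hypothetical, and the title and filename values are purely illustrative.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: map a 32-character hex digest to an on-disk path,
# fanning files out over subdirectories keyed on the first two digits so
# that no single directory has to hold thousands of entries.
sub storage_path {
    my ($root, $digest) = @_;
    die "Not an MD5 hex digest: $digest" unless $digest =~ /^[0-9a-f]{32}\z/;
    return File::Spec->catfile($root, substr($digest, 0, 2), $digest);
}

# Stand-in for database rows: stored digest => human-readable details.
my %catalog = (
    'e3206099b8ad73408762ab0ea5e8f1f2' => {
        title    => 'Example Title',    # illustrative value
        filename => 'example.pdb',      # illustrative value
    },
);

# A listing page shows the title; only the server sees the digest path.
for my $digest (sort keys %catalog) {
    printf "%s -> %s\n",
        $catalog{$digest}{title}, storage_path('/tmp/palm', $digest);
}
```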
I've never done something like this before (tracking and storing persistent files and file data), so I'm a bit green. I'd eventually like the whole process of approval to be web-based, but for now I can deal with some manual intervention at the filesystem or database level. My concerns are:
Each file uploaded will have several bits of information associated with it, such as:
To that end, I've come up with the following basic schema for a table to hold this data:
    CREATE TABLE sample_uploads (
        sample_id          tinyint(4) NOT NULL default '0',
        sample_submit_date datetime   NOT NULL default '0000-00-00 00:00:00',
        sample_remote_addr text       NOT NULL,
        sample_remote_host text       NOT NULL,
        sample_remote_ua   text       NOT NULL,
        sample_db_title    text       NOT NULL,
        sample_user_name   text       NOT NULL,
        sample_copyright   tinyint(4) NOT NULL default '0',
        sample_image_depth tinyint(4) NOT NULL default '0',
        sample_category    tinyint(4) NOT NULL default '0',
        sample_description text       NOT NULL,
        sample_filename    text       NOT NULL,
        PRIMARY KEY (sample_id)
    ) TYPE=MyISAM;
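To show how one upload might become one row in this table, here is a small sketch that builds a placeholder-based INSERT from a hash of column values. The helper name build_insert is hypothetical, and no database connection is made; the closing comment assumes a connected DBI handle in $dbh. Since sample_id has no automatic numbering in the schema, the caller has to supply it.

```perl
use strict;
use warnings;

# Column names are taken from the sample_uploads table.
my @COLUMNS = qw(
    sample_id          sample_submit_date sample_remote_addr
    sample_remote_host sample_remote_ua   sample_db_title
    sample_user_name   sample_copyright   sample_image_depth
    sample_category    sample_description sample_filename
);

# Hypothetical helper: build a placeholder-based INSERT and its bind list.
sub build_insert {
    my ($row) = @_;

    my @missing = grep { !defined $row->{$_} } @COLUMNS;
    die "Missing columns: @missing" if @missing;

    my $sql = sprintf 'INSERT INTO sample_uploads (%s) VALUES (%s)',
        join(', ', @COLUMNS), join(', ', ('?') x @COLUMNS);

    return ($sql, map { $row->{$_} } @COLUMNS);
}

# With a connected DBI handle (not shown), this would run as:
#   my ($sql, @bind) = build_insert(\%row);
#   $dbh->do($sql, undef, @bind);
```

Here sample_filename would carry the md5sum'd name on disk and sample_db_title the title shown to users, which is what ties the two together.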
Does this approach hold water? The only bits I think I'm missing are what to do with the file(s) sent, how to access them, and how to make sure the user sees "human" content (filenames, titles), while the system sees "protected" (md5sum) content.
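For the "human" versus "protected" split at download time, one option is to read the file from its md5sum'd location but suggest the readable name to the browser through a Content-Disposition header. A minimal sketch follows; the helper name download_headers is hypothetical, and the generic application/octet-stream type is an assumption rather than a registered Palm type.

```perl
use strict;
use warnings;

# Hypothetical helper: build HTTP headers that offer the stored file
# under its human-readable name instead of its digest.
sub download_headers {
    my ($display_name, $size) = @_;

    # Keep the header well formed: replace path separators, quotes,
    # and control characters in the suggested name with underscores.
    (my $safe = $display_name) =~ s{[\\/"\x00-\x1f]}{_}g;

    return join "\r\n",
        'Content-Type: application/octet-stream',
        "Content-Length: $size",
        qq{Content-Disposition: attachment; filename="$safe"},
        '', '';    # blank line ends the header block
}
```

The download script would print these headers and then stream the bytes from the digest-named file, so the digest never appears in what the user saves.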
Constructive ideas and architecture approaches are welcome. Thanks.
Replies are listed 'Best First'.
(jeffa) Re: Designing storage of uploaded files
by jeffa (Bishop) on Aug 20, 2002 at 03:06 UTC
Re: Designing storage of uploaded files
by thraxil (Prior) on Aug 20, 2002 at 15:00 UTC
Re: Designing storage of uploaded files
by blokhead (Monsignor) on Aug 20, 2002 at 16:36 UTC