rrtrems has asked for the wisdom of the Perl Monks concerning the following question:

I built a form to allow attachments. I can upload the file successfully and I can send the email with the attachments successfully. The files have content, so there are no issues with that.

My issue is with the file name. When the file name is uploaded it includes the entire path but without "/" so it's not parsing correctly when I use "fileparse"

EXAMPLE: When I send the file the file name ends up being C_Documents and SettingsrytremblMyDocuments9000_dollar_check_image.pdf

See, the name runs together. Here is my code:

sub openattachment {
    my $Buffer;
    my $BytesRead;
    my $Size = 0;
    my $FileLimit = 1024000;
    $theirFile = $attachmentFiles[$i];
    $FileName = $attachmentFiles[$i];
    my $tmpDir = "/tmp";
    my $date_stamp;
    my $fileCount = $i + 1 ;

    #Get the date to stamp on the file
    &getDateAndTime;
    $date_stamp = "$mon$mday$year"."_"."$hour"."hr"."$min"."min"."$sec";

    # Parse FileName to remove path
    my ($name,$path,$extension) = fileparse($FileName,'\..*');
    $FileName = $name . $extension;

    # Replace blank spaces with underscores
    #$FileName =~ s/\s/_/g;

    # Remove non safe characters
    #$FileName =~ s/[^a-zA-Z0-9_.-\\]//g;

    # Create temporary directory if it doesn't exist
    if ( ! -d "$tmpDir" ) {
        mkdir "$tmpDir";
        chmod 0777,"$tmpDir";
    }

    # Delete file if it exists
    if ( -r "$tmpDir/$FileName" ) {
        unlink "$tmpDir/$FileName";
    }

    open(OUTPUT,">$tmpDir/$FileName");
    while ($BytesRead = read($theirFile,$Buffer,1024)) {
        print OUTPUT $Buffer;
        $Size += $BytesRead;
        if ($Size > $FileLimit) {
            &error_message("Your file is over the file size limit.");
            close (OUTPUT);
            unlink "$tmpDir/$FileName";
            $theirName = "NULL";
        }
    }
    $myFilePath = "$tmpDir/$FileName";
    close(OUTPUT);
    chmod 0666,"$tmpDir/$FileName";
}

Re: FileParse is NOT working correctly
by almut (Canon) on Apr 20, 2009 at 18:40 UTC

    As your server seems to be running some Unix variant (I'm going by the "/tmp"), you probably want to specify the path type explicitly. Otherwise, "File::Basename will assume a file path type native to your current operating system...", i.e. $^O will be inspected. In this particular case:

    fileparse_set_fstype('MSWin32');

    (This of course assumes that the filename actually comes in as C:\Documents and Settings\... (with backslashes). In other words, to debug this, it's probably a good idea to just check (print out) the value of $FileName before you pass it to fileparse().)
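A minimal sketch of that suggestion, assuming the browser really does send a backslash-separated Windows path. The helper name client_basename and the sample path are illustrative only; fileparse_set_fstype() returns the previous setting, which the sketch restores so that other callers of File::Basename are unaffected.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse fileparse_set_fstype);

# Hypothetical helper (not from the original post): return only the file
# name part of a client-supplied path, parsing it with Windows rules no
# matter which operating system the server runs on.
sub client_basename {
    my ($raw) = @_;

    # Switch the parser to Windows rules and remember the previous setting.
    my $previous = fileparse_set_fstype('MSWin32');
    my ($name, $dir, $suffix) = fileparse($raw, qr/\.[^.]*/);

    # Restore the previous setting for the rest of the program.
    fileparse_set_fstype($previous);

    return $name . $suffix;
}

# Assumed sample input, modelled on the path quoted in the question.
my $upload = 'C:\Documents and Settings\rytrembl\MyDocuments\9000_dollar_check_image.pdf';
print client_basename($upload), "\n";
```

With the MSWin32 type both "\" and "/" count as directory separators, so a bare name such as win.ini passes through unchanged.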

      This appears to have been the problem. It's something I NEVER would have figured out!!!! Thank you so much for your input and help.
Re: FileParse is NOT working correctly
by ELISHEVA (Prior) on Apr 20, 2009 at 18:40 UTC

    Please see File::Temp for the preferred way to create temp files.
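As a rough illustration of the File::Temp approach, the sketch below copies an already opened upload handle into a temporary file. The helper name save_upload, the "upload_XXXXXX" template and the in-memory test data are assumptions made for the example, not part of the original script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Spec;

# Hypothetical helper: copy data from an open filehandle into a fresh
# temporary file and return the path of that file.
sub save_upload {
    my ($in, $suffix) = @_;

    # tempfile() chooses an unused name and opens the file safely.
    my ($out, $path) = tempfile(
        'upload_XXXXXX',
        DIR    => File::Spec->tmpdir,
        SUFFIX => $suffix,
        UNLINK => 1,    # remove the file when the program exits
    );
    binmode $out;

    # Copy in 1 KB chunks, as the original subroutine does.
    my $buffer;
    while (my $bytes = read($in, $buffer, 1024)) {
        print {$out} $buffer;
    }
    close $out or die "Cannot close $path: $!";

    return $path;
}

# Stand-in for the CGI upload handle: an in-memory filehandle.
my $data = "dummy file content\n";
open my $in, '<', \$data or die "Cannot open in-memory handle: $!";
my $path = save_upload($in, '.pdf');
print "saved to $path\n";
```

Because the temporary name is generated, two uploads with the same client file name cannot overwrite each other; the original name can still be passed separately to the mailer for display.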

    As for the filename: where is it coming from? Just scanning the code I can't see anything that would mess up the file name. You've commented out the s/../.. lines so the problem can't be coming from there. fileparse certainly isn't inserting underscores where there should be spaces and colons (e.g. C: => C_). Are you sure the colon after the C and the directory dividers (\ or /) are in the filename when it gets passed to the function? Is this the output that you got before or after you commented out the substitution regexes?

    And where, by the way, are you setting the $i in $attachmentFiles? $i isn't normally used as a global variable. Is this really the subroutine that is generating this output? Or did you cut and paste something that used to be the guts of a for loop into a subroutine?

    To see what is really happening, you might try adding print STDERR "filename=<$FileName>\n" to the start of your subroutine.

    If you are passing the filenames in @attachmentFiles "as is" via the cygwin bash command line, your backslashes will disappear from the filename before it ever makes it into @ARGV. To keep the backslashes in the name, you need to single quote it on the command line or else use double backslashes, e.g.

    perl myscript.pl 'C:\Documents and Settings\rytrembl\MyDocuments\9000_dollar_check_image.pdf'
    perl myscript.pl C:\\Documents and Settings\\rytrembl\\MyDocuments\\9000_dollar_check_image.pdf

    Best, beth

    Update added some more questions to OP.

      This is part of a huge form processing CGI. Nothing comes from command lines, everything comes from form input.

      Here is the code that retrieves the file from the form and the email that sends the attachment. The file name appears to have underscores after it is retrieved from the form for some reason????

      # ADD ATTACHMENTS TO FORM
      # Set the attachment variables
      my $FileName;
      my $FileName2;
      my $FileName3;
      my $myFilePath;
      my $i;
      my @attachmentFiles;

      # Get the attachment
      $FileName = $query->param('FileName');
      $FileName2 = $query->param('FileName2');
      $FileName3 = $query->param('FileName3');
      @attachmentFiles = ();

      # Put the attachments in an Array
      # This allows each file to be filtered through read and attached
      if( !defined($FileName) || $FileName ne "") {
          push (@attachmentFiles, $FileName);
          if( !defined($FileName2) || $FileName2 ne "") {
              push (@attachmentFiles, $FileName2);
              if( !defined($FileName3) || $FileName3 ne "") {
                  push (@attachmentFiles, $FileName3);
              }
          }
      }

      This pulls in the values from the form. There are three attachment fields so they get pushed to an array, but this problem has been there from the get go when there was only 1 file. My first thought was to simply rename the file with a time stamp, but I would rather retain the name of the file they uploaded.

      Here is the code that attaches it to the email. I had worked on this for months and don't understand why the path is being altered so that I can't get the file name easily.

      sub admin_mailer {
          print DEBUG "in admin mailer...\n" if ($debug_log);
          $admin = $query->param('admin');

          # If $email begins "test:" then we are in a test mode. Send
          # the admin e-mail to the specified e-mail address.
          if ( $email =~ /^test:/ ) {
              $email =~ s/^test\://;
              $admin = $email;
          }
          @sendary = split ( /,/, $admin );
          if ( $query->param('admin_subject') ) {
              $subj = convertInputString( $query->param('admin_subject') );
          }
          else {
              $subj = convertInputString( $query->param('subject') );
          }
          print DEBUG "Admin mailer converting...\n";
          my $email = convertInputString( $query->param('EMail') );
          my $fname = convertInputString( $query->param('FirstName') );
          my $lname = convertInputString( $query->param('LastName') );
          my $from = $fname . " " . $lname . " <" . $email . ">";

          #
          # Give the e-mail client a hint of what charset to
          # expect. Some clients may ignore but at least we
          # try to behave nicely.
          #
          #&logEvent("Mailing admins @sendary From: $fname $lname");
          #open( MAIL, "|$MAILPROG @sendary" ) or die "Cannot open $MAILPROG: $!";
          #print MAIL "MIME-Version: 1.0\n";
          #if ( $inputCharset =~ /UTF\-8/i ) {
          #    print DEBUG "Formatting Mail UTF-8\n" if ($debug_log);
          #    print MAIL "Content-type: text/plain; charset=UTF-8\n";
          #}
          #else {
          #    print DEBUG "Formatting Mail ISO-8859-1\n" if ($debug_log);
          #    print MAIL "Content-type: text/plain; charset=ISO-8859-1\n";
          #}
          #print MAIL "Content-type: text/plain; charset=" . $outputCharset . "; \n";
          #print MAIL "From: \"$fname $lname\"<$email>\n";
          #print MAIL "To: $admin\n";
          #print MAIL "Subject: $subj \n";
          #print MAIL "\n";
          #print MAIL "";
          @names = $query->param;
          @namelist = split ( /,/, $query->param(admin_mailer_fields) );
          print DEBUG "namelist : @namelist\n";
          if (@namelist) {
              # Go through the form hash from loadFormHash and get the desired fields in the order of
              # admin_mailer_values
              foreach $name (@namelist) {
                  $message_body = $message_body . $name . ": " . $HASH{$name} . "\n\n";
              }
          }
          else {
              foreach $record ( keys(%HASH) ) {
                  $message_body = $message_body . $record . ": " . $HASH{$record} . "\n\n";
              }
          }

          ### Create the multipart container
          $msg = MIME::Lite->new (
              From    => $from,
              To      => $admin,
              Subject => $subj,
              Type    =>'multipart/mixed'
          ) or die "Error creating multipart container: $!\n";

          ### Add the text message part
          $msg->attach (
              Type => 'TEXT',
              Data => $message_body
          ) or die "Error adding the text message part: $!\n";

          ### Add the file
          if(@attachmentFiles ne "" || @attachmentFiles ne "NULL"){
              for ( $i = 0 ; $i < scalar ( @attachmentFiles ) ; $i++ ) {
                  &openattachment;
                  if($theirFile ne "NULL" || $theirFile ne "" ){
                      $msg->attach (
                          Type        => 'AUTO',
                          Path        => $myFilePath,
                          FileName    => $FileName,
                          Disposition => 'attachment'
                      );
                  }
              }
          }

          ### Send the Message
          $msg->send;
      }

        The file name appears to have underscores after it is retrieved from the form

        So you're saying the path already has the underscores when you print out $FileName right after this(?):

        # Get the attachment
        $FileName = $query->param('FileName');

        If so, it's no surprise fileparse() has issues with it. In this case, the next step would be to investigate what the browser actually sends...

        One general debugging rule is to reduce the case to the smallest piece of code that does exhibit the problem. Why look at the mailing code when the problem is with parsing the filename? Remove everything that likely is unrelated. If the problem persists, it was unrelated. Otherwise, take the code back in stepwise until the problem reappears...  Also, always verify implicit assumptions. I.e. if you suspect that fileparse() is doing something wrong, you're implicitly assuming it's getting proper input. Verify it, print it out.  That way, you can usually narrow down rather quickly on where things are going wrong...

        And if you haven't found the error yourself by then, you've at least produced a small (and hopefully self-contained) piece of code which would allow others to reproduce the issue.

Re: FileParse is NOT working correctly
by afoken (Chancellor) on Apr 20, 2009 at 22:53 UTC

    When the file name is uploaded it includes the entire path but without "/" so it's not parsing correctly when I use "fileparse"

    This means you have tested your code only with Internet Explorer. All other browsers I know just send the filename, not the path. This may bite you as soon as someone does not use IE.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      That is not true. I have used multiple versions of IE and Firefox to test this form.

        It is true, even if you do not want to hear it. Only MS IE sends the entire file name including drive letter and directories, all other browsers I know, including Firefox, Netscape, Opera and Konqueror, send only the file name. Copy the following file into a CGI enabled directory of a webserver as uploadtest.cgi and see for yourself: Open http://server/cgi-bin/uploadtest.cgi, choose any file, and click the submit button.

        #!/usr/bin/perl -Tw

        use strict;
        use CGI qw(:all);
        use Data::Dumper;

        if (request_method() eq 'POST') {
            my $f=param('f');
            my $info=uploadInfo($f);
            print
                header(),
                start_html(),
                h1('Upload Metadata'),
                pre(escapeHTML(Dumper($info))),
                end_html();
        } else {
            print
                header(),
                start_html(),
                start_multipart_form(),
                filefield(-name=>'f',-size=>50),
                submit(),
                end_form(),
                end_html();
        }

        Result with Firefox 3.0.8:

        $VAR1 = {
                  'Content-Type' => 'application/octet-stream',
                  'Content-Disposition' => 'form-data; name="f"; filename="win.ini"'
                };

        Result with IE 6.0.2800.1106:

        $VAR1 = {
                  'Content-Type' => 'application/octet-stream',
                  'Content-Disposition' => 'form-data; name="f"; filename="C:\\WINNT\\win.ini"'
                };

        OK, let's be paranoid and let's assume Lincoln D. Stein and I added some evil code into CGI.pm just to make your life harder. So let's get rid of CGI.pm and look at the raw, unparsed data. Copy the following script as uploadtest2.cgi into the CGI-enabled directory of the webserver:

        #!/usr/bin/perl -Tw

        use strict;

        print "Content-Type: text/html\015\012\015\012";
        if ($ENV{'REQUEST_METHOD'} eq 'POST') {
            print "<html><body><plaintext>";
            print while <STDIN>;
        } else {
            print
                '<html><body>',
                '<form method="post" action="" enctype="multipart/form-data">',
                '<input type="file" name="f" size="50">',
                '<input type="submit">',
                '</form></body></html>';
        }

        Result with FF:

        -----------------------------114782935826962
        Content-Disposition: form-data; name="f"; filename="win.ini"
        Content-Type: application/octet-stream

        # file content here

        Result with IE:

        -----------------------------7d936e1f40214
        Content-Disposition: form-data; name="f"; filename="C:\WINNT\win.ini"
        Content-Type: application/octet-stream

        # file content here
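Since the two browsers send different values for the same file, the server side can normalise the name before using it. The following is only a sketch under that assumption: the helper name safe_upload_name, the fallback name "upload" and the character whitelist are choices made for this example, not something taken from the scripts above.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce whatever the browser sent to a bare file
# name that is safe to use on the server.
sub safe_upload_name {
    my ($raw) = @_;
    return 'upload' unless defined $raw && length $raw;

    # Drop everything up to the last backslash, slash or colon, so a full
    # Windows path and a bare file name end up the same.
    (my $name = $raw) =~ s{^.*[\\/:]}{}s;

    # Replace anything outside a conservative whitelist with an underscore.
    $name =~ s/[^A-Za-z0-9._-]/_/g;

    # Strip leading dots and fall back to a default if nothing is left.
    $name =~ s/^\.+//;
    return length $name ? $name : 'upload';
}

# Sample inputs: the two forms shown above plus a Unix-style path.
for my $sent ('win.ini', 'C:\WINNT\win.ini', '/home/user/my report.pdf') {
    printf "%-28s => %s\n", $sent, safe_upload_name($sent);
}
```

This avoids changing the global File::Basename path type, at the cost of treating a colon in a Unix file name as a separator.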

        Alexander
