Arik123 has asked for the wisdom of the Perl Monks concerning the following question:

Does Email::MIME support attachments of files with utf8 filename?

My code crashed, and I think that may be the problem. Can anyone verify it?

EDIT:

I have a MIME-formatted text file (it is actually an email I received and saved to a file). The file itself is long, but I think the relevant part is:

--00000000000076fecc057dd5edea--
--00000000000076fece057dd5edec
Content-Type: application/pdf; name="=?UTF-8?B?157Xmdec15kg15PXkNeR15XXqi5wZGY=?="
Content-Disposition: attachment; filename="=?UTF-8?B?157Xmdec15kg15PXkNeR15XXqi5wZGY=?="
Content-Transfer-Encoding: base64
Content-ID: <167836db7a888894ca51>
X-Attachment-Id: 167836db7a888894ca51
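For reference, the name and filename values in these headers are RFC 2047 encoded-words (UTF-8, base64). The following is a minimal sketch, assuming only the core Encode module and independent of Email::MIME, that decodes one of them as a single unbroken token. The variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Encoded-word taken from the Content-Disposition header,
# written as one unbroken token.
my $raw = '=?UTF-8?B?157Xmdec15kg15PXkNeR15XXqi5wZGY=?=';

# 'MIME-Header' is the Encode name for RFC 2047 header decoding;
# the result is a Perl character string.
my $name = decode('MIME-Header', $raw);

# Encode on output so the non-ASCII characters print cleanly.
binmode STDOUT, ':encoding(UTF-8)';
print "$name\n";
```

The decoded string should be the same filename that is quoted in the warning messages further down, which suggests the encoded-word itself is well formed.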

As you see, there's an attachment in the message, which happens to be a pdf file with non-ascii name. Now, my code is:

#!perl
use Email::MIME;
$file = shift || "g";
open G, $file;
$g = join '', <G>; close G;
$e = new Email::MIME ($g);

The code generated the following message:

Unquoted '"' not allowed at ./chfn.pl line 6.
Missing semicolon before parameter '"מילי דאבות.pdf"' at ./chfn.pl line 6.

The error is not fatal; that is, the code continues to run after generating this message. However, I cannot extract the filename of the attachment using Email::MIME methods. Now consider this:

perl -pe 's/name=".*"/name="fn.pdf"/g' g >g2
./chfn.pl g2

This runs without errors, so I conclude that the problem is the non-ASCII filename of the attachment. BTW the OS is Linux, not Windows.

Thanks a lot!

Re: Email::MIME support for utf8 filename
by Corion (Patriarch) on Dec 25, 2018 at 10:37 UTC

    Please help us help you better by answering the following questions:

    1. What is your code? Please edit your post and add the relevant code to it. Ideally you replace all variable parts that are read from files with hardcoded values. Make sure that the problem persists with the hardcoded values.
    2. How did your code "crash"? Please edit your post and add the exact error message you got from your code. We can't help you much without the error message, the line number and the code matching to that line number.
    3. Have you made sure that the problem is related to UTF-8-encoded filenames at all? Does the same code work with plain ASCII filenames? Please try out (and tell us about) the variations to find out the root cause of the problems.
    4. Are you certain that the filename is UTF-8? Not all filesystems and file system APIs encode non-ASCII filenames as UTF-8. For example on Windows, you need to use special versions of the file system API (the "Wide" functions) to access filenames with non-ASCII parts. See for example Win32::LongPath for a module that uses the Wide APIs.

      Does my edit (to the original post) satisfy you?

        Yes, that makes it much easier to find where the warning originates:

        The warning comes from Email::MIME::ContentType line 106:

        if ($STRICT_PARAMS and length $ct and $ct !~ /^;/) {
            carp "Missing semicolon before first Content-Type parameter '$ct'";

        So, one very simplistic workaround is to set $Email::MIME::ContentType::STRICT_PARAMS to 0 to suppress this warning.
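
        A minimal sketch of that workaround, assuming the installed Email::MIME::ContentType exposes $STRICT_PARAMS as a package variable as in the excerpt above. It only silences the warning; whether the filename can then be extracted is not something this sketch verifies. The file handling mirrors the original script, with "g" as the default file name.

        ```perl
        use strict;
        use warnings;
        use Email::MIME;

        # Workaround: switch off the strict parameter check before parsing.
        # Assumption: this only suppresses the "Missing semicolon" warning.
        $Email::MIME::ContentType::STRICT_PARAMS = 0;

        my $file = shift || 'g';

        if (open my $fh, '<:raw', $file) {
            my $raw = do { local $/; <$fh> };   # slurp the whole message
            close $fh;

            my $email = Email::MIME->new($raw);
            my @parts = $email->subparts;
            printf "Parsed %s: %d subpart(s)\n", $file, scalar @parts;
        }
        else {
            warn "Cannot open $file: $!\n";
        }
        ```

        Setting the variable globally, rather than with local inside a small scope, is deliberate here: it stays in effect for any later parsing of subparts as well.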

        I don't know where/why the content type string goes awry, but that's to debug another day...