PerlMonks  

Re: MIME::Tools to save attachment properly

by sachss (Sexton)
on Jan 24, 2017 at 02:33 UTC (#1180190)


in reply to MIME::Tools to save attachment properly

I am trying to parse emails from Yahoo that have split 7z files attached. I was able to log in to Yahoo successfully with Net::IMAP::Simple; the connection object is in $server. My code:
    my $parser = MIME::Parser->new();
    my $entity = $parser->parse_data(join '', @{ $server->get($i) });
    my $numParts = $entity->parts;
    my @parts    = $entity->parts;
    if ($numParts > 0) {
        foreach my $part (@parts) {
            my $type = $part->mime_type;
            my $bh   = $part->bodyhandle;
            print "MIME Type: $type\n";
            if (defined $bh) {
                open(my $OUTFILE, ">", $bh->path) or die $!;
                binmode($OUTFILE);
                $bh->print(\$OUTFILE);
                close($OUTFILE);
            }    # End if $bh defined
        }    # End for each part
    }    # End if number of parts > 0
When I run this, I get

msg-1304-1.txt of size 826

msg-1304-2.html of size 2,661

_Archive.7z.046 of size 0, when it should be 8,975,069

All I care about is the _Archive.7z.001 thru _Archive.7z.046

Help me O' Great Monks

Replies are listed 'Best First'.
Re^2: MIME::Tools to save attachment properly
by Corion (Patriarch) on Jan 24, 2017 at 08:00 UTC

    Have you looked at the ->mime_type and the other attributes of the attachment?

    Are you certain that the attachment itself is OK in the mail?

    Have you inspected the raw data before parsing it? Maybe it is broken in some way that MIME::Parser doesn't handle?

      I looked at ->mime_type, which is "application/octet-stream", but I am not aware of what other attributes are available.

      The attachment in the actual email is fine as far as I can tell. I have about 46 emails, each with a single attachment of about 10 MB, which together make up a 460 MB 7z archive.

      How do I check the raw data? I thought that was what I was trying to output, and it comes out as zero-length. Spot-checking suggests the file might be getting overwritten for some reason, as I do see a non-zero length while the script runs and then zero length at the end.

        I checked the $entity data with Data::Dump, and it looks like an actual email. I did find that

            my $bh = $part->bodyhandle;
            print "MIME Type: $type\n";

        outputs all the parts.

        But

            my $bh = $part->bodyhandle;
            print "MIME Type: $type\n";
            if (defined $bh) {
                open(my $OUTFILE, ">", $bh->path) or die $!;
                binmode($OUTFILE);
                $bh->print(\$OUTFILE);
                close($OUTFILE);
            }    # End if $bh defined

        the $bh->print call seems to 'overwrite' the attachments, making them zero-length.

        So I removed that part and my code works. I do need to look at MIME::Parser::Filer more to clean up my code.

        I do appreciate everyone's input and this wonderful site.
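
A possible explanation for the zero-length files, offered as a sketch rather than a confirmed diagnosis: by default MIME::Parser writes each decoded body to a file on disk (hence the msg-1304-1.txt files above), and $bh->path names that file. Opening the same path with ">" truncates it before anything can be read back from it. The demonstration below uses only core Perl, with a temporary file standing in for an attachment the parser has already saved; demo_truncation is a hypothetical name used for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for an attachment that the parser has already written to disk.
sub demo_truncation {
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;
    print {$fh} "x" x 1024;
    close $fh or die "close: $!";
    my $before = (stat $path)[7];

    # Same pattern as the failing code: open the existing file for writing.
    # Mode '>' empties the file as soon as the open succeeds.
    open(my $out, '>', $path) or die "open: $!";
    binmode $out;
    close $out or die "close: $!";
    my $after = (stat $path)[7];

    return ($before, $after);
}

my ($before, $after) = demo_truncation();
print "before: $before bytes, after: $after bytes\n";
```

If this is what happened, it would also explain why removing the open/print/close block fixes things: the parser had already saved the attachment, so there was nothing left to write.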

        I would look at the data you accumulate here:

        my $entity = $parser->parse_data(join '', @{$server->get($i)});

        And write that to a file and try to find out whether the attachment payload is malformed or whether there are two attachments with the same suggested output filename etc.
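
One way to do that check, as a rough sketch using only core Perl: dump the raw message to a file untouched, then count how often each suggested filename appears. The helper names dump_raw and suggested_filenames are hypothetical, and the regex scan is deliberately crude (it is not a MIME parser and ignores RFC 2231 encoded names).

```perl
use strict;
use warnings;

# Hypothetical helper: write the raw message to $file unchanged and
# return the number of bytes that ended up on disk.
sub dump_raw {
    my ($raw, $file) = @_;
    open(my $fh, '>', $file) or die "Cannot write $file: $!";
    binmode $fh;
    print {$fh} $raw;
    close $fh or die "Cannot close $file: $!";
    return (stat $file)[7];
}

# Hypothetical helper: count filename="..." parameters in the raw text.
# Any name with a count above 1 is a candidate for overwriting.
sub suggested_filenames {
    my ($raw) = @_;
    my %count;
    $count{$1}++ while $raw =~ /\bfilename="([^"]+)"/gi;
    return %count;
}

# Usage with the code from the thread (assumes $server and $i exist):
#   my $raw = join '', @{ $server->get($i) };
#   dump_raw($raw, "message-$i.eml");
#   my %names = suggested_filenames($raw);
#   print "$_ appears $names{$_} time(s)\n" for sort keys %names;
```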

        If you suspect that the output file is overwritten, consider running your program under truss or strace to find the API calls that are made. Also consider looking at MIME::Parser::Filer to avoid using (or at least output) the filenames that come in the mail and give your own filenames if you suspect overwriting is going on.
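
A minimal sketch of the MIME::Parser::Filer suggestion, assuming MIME::Tools is installed: let the parser file the parts into a directory of your choosing, tell the filer to ignore the filenames recommended by the mail, and only report the paths afterwards instead of reopening them. The helper name save_parts is hypothetical; output_dir, filer, ignore_filename, parts_DFS, bodyhandle and path are the documented MIME::Tools methods as I understand them, so check them against your installed version.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MIME::Parser;

# Hypothetical helper: parse a raw message and return [mime_type, path, size]
# for every part that MIME::Parser saved to disk.
sub save_parts {
    my ($raw, $dir) = @_;

    my $parser = MIME::Parser->new;
    $parser->output_dir($dir);             # file parts into our own directory
    $parser->filer->ignore_filename(1);    # generate names, ignore the mail's

    my $entity = $parser->parse_data($raw);

    my @saved;
    for my $part ($entity->parts_DFS) {
        my $bh = $part->bodyhandle or next;    # multipart containers have no body
        my $path = $bh->path;
        next unless defined $path;             # in-core bodies have no path
        # Do not open $path for writing here: the parser already filled it.
        push @saved, [ $part->mime_type, $path, (-s $path) || 0 ];
    }
    return @saved;
}

# Usage with the code from the thread (assumes $server and $i exist):
#   my $raw = join '', @{ $server->get($i) };
#   printf "%s  %s  %d bytes\n", @$_ for save_parts($raw, tempdir(CLEANUP => 1));
```

Note that ignore_filename is set after output_dir, since output_dir installs a fresh filer object. With generated names you lose the _Archive.7z.NNN naming, so you would need to rename the files yourself afterwards.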

        Maybe you can also remove most of the base64 encoded payload (if it has been proven to be valid) and replace it by something much smaller that you can post here so we can try to reproduce your problem.
