Faile has asked for the wisdom of the Perl Monks concerning the following question:
1st question:
Is there a library out there to read SMTP messages (written to disk in text form with attachments still encoded in various ways) and reliably extract the names of any attachments from them (regardless of sending client)?
Question answered by Alexander
2nd question:
Situation: I have an application that writes messages in raw format to disk (it's a security backup feature in case an attachment gets lost somewhere). I want to verify that all the attachments listed in the raw MIME messages are available in an archive directory. Because of application processing, the archived files have timestamps prepended to their filenames, so the names do not match 1:1.
Solution design:
- read all MIME messages and store the expected attachment filenames in an array
- index all the files in the archive and store those in an array
- look for each filename in the archive and warn if none is found
Parameters:
- There are several tens of thousands of files
- Verification is run by hand occasionally, not frequently
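With this many files, one possible way to sketch the three design steps is a hash lookup instead of scanning every archive name for every attachment. The helper names `original_name` and `missing_attachments` are hypothetical, and the sketch assumes the prepended timestamp is always a run of digits followed by an underscore, as in the sample data; a different prefix format would need a different pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reduce an archive path to the original attachment name.
# Assumption: the prepended timestamp is a run of digits followed by "_".
sub original_name {
    my ($path) = @_;
    my ($base) = $path =~ m{([^\\/]+)\z};   # last path component, "\" or "/" separators
    return unless defined $base;
    $base =~ s/\A\d+_//;                    # drop the timestamp prefix, if present
    return $base;
}

# Hypothetical helper: return the expected names that have no archive file.
sub missing_attachments {
    my ($expected, $archived) = @_;
    my %in_archive;
    for my $path (@$archived) {
        my $name = original_name($path);
        $in_archive{$name} = $path if defined $name;
    }
    return grep { !exists $in_archive{$_} } @$expected;
}

my @attfiles = ('foo.txt', 'faa.xml', 'fii.pdf');
my @arcfiles = (
    'x:\archive\1234567890123_foo.txt',
    'x:\archive\1234567890123_fuu.xml',
    'x:\archive\1234567890123_fii.pdf',
);

print "WARNING: Could not find $_\n"
    for missing_attachments(\@attfiles, \@arcfiles);
```

Unlike a substring search, this compares whole names after the prefix is stripped, so `foo.txt` would not be satisfied by an archived `barfoo.txt`. The comparison is case-sensitive; if the archive lives on a case-insensitive file system, applying `lc` to both sides may be closer to what is wanted.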
In the snippet below I've replaced the input logic with two arrays of sample data:
    #!/usr/bin/perl -w
    use warnings;
    use strict;

    my @attfiles = ( 'foo.txt', 'faa.xml', 'fii.pdf' );
    my @arcfiles = (
        'x:\archive\1234567890123_foo.txt',
        'x:\archive\1234567890123_fuu.xml',
        'x:\archive\1234567890123_fii.pdf'
    );

    foreach my $att (@attfiles) {
        my $found = 0;
        foreach my $arc (@arcfiles) {
            my $result = index($arc, $att);
            if ($result >= 0) {
                print "Found $att in $arc\n";
                $found = 1;
                last;
            }
        }
        unless ($found) {
            print "WARNING: Could not find $att\n";
        }
    }
My second idea was to replace the index() call with a simple regexp match, because what interests me is "is it there?", not "where is it?".
    #!/usr/bin/perl -w
    use warnings;
    use strict;

    my @attfiles = ( 'foo.txt', 'faa.xml', 'fii.pdf' );
    my @arcfiles = (
        'x:\archive\1234567890123_foo.txt',
        'x:\archive\1234567890123_fuu.xml',
        'x:\archive\1234567890123_fii.pdf'
    );

    foreach my $att (@attfiles) {
        my $found = 0;
        foreach my $arc (@arcfiles) {
            # \Q...\E quotes regex metacharacters, so the "." in a
            # filename matches only a literal dot
            if ( $arc =~ m/\Q$att\E/ ) {
                print "Found $att in $arc\n";
                $found = 1;
                last;
            }
        }
        unless ($found) {
            print "WARNING: Could not find $att\n";
        }
    }
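One caveat with the regexp variant: a filename interpolated into a pattern is treated as a pattern, so a `.` in the name matches any character. The small demonstration below uses a made-up archive name (`fooXtxt`, not from the real data) to show the difference that `\Q...\E` makes.

```perl
use strict;
use warnings;

my $att = 'foo.txt';
my $arc = 'x:\archive\1234567890123_fooXtxt';   # hypothetical name, not the attachment

# Unquoted: "." matches any character, so this is a false positive.
my $loose  = $arc =~ m/$att/     ? 1 : 0;
# Quoted: \Q...\E escapes metacharacters, so "." only matches a literal dot.
my $strict = $arc =~ m/\Q$att\E/ ? 1 : 0;

print "unquoted: $loose, quoted: $strict\n";   # prints "unquoted: 1, quoted: 0"
```

Names containing characters such as `(`, `[` or `+` can also produce a pattern that fails to compile or matches something unintended, which is another reason to quote the interpolated name.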
TMTOWTDI most likely applies here, so I was wondering whether someone has already solved this neatly, perhaps in a different way?
Thanks for all your help in advance, I am but a humble padawan. :)
Re: Attachments in emails and finding matches and cross referencing
by afoken (Chancellor) on Nov 25, 2010 at 13:05 UTC
by Faile (Novice) on Nov 25, 2010 at 14:11 UTC