in reply to Re^6: extracting name & email address
in thread extracting name & email address
It seems the attachment stripper either doesn't work or I have coded it incorrectly. I am currently running a script without the attachment stripper. It has been running for an hour or so, and this message appears constantly: "Complex regular subexpression recursion limit (32766) exceeded at /usr/share/perl5/Email/Address.pm line 108." This most likely has something to do with a regex being run over the attachments, hence the need to process email files without attachments.
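If the warning really does come from attachment text reaching the regex, one way to avoid it is to never read the body at all. A minimal sketch using only core Perl, assuming each file holds a single message whose headers end at the first blank line; `read_header_block` is a hypothetical helper name, not part of any module:

```perl
use strict;
use warnings;

# Hypothetical helper: return only the header block of a message file,
# i.e. everything before the first blank line. The body (and therefore
# any attachment data) is never read into memory.
sub read_header_block {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $header = '';
    while ( my $line = <$fh> ) {
        last if $line =~ /^\r?$/;    # a blank line ends the header block
        $header .= $line;
    }
    close $fh;
    return $header;
}

# Print the header block of a file given on the command line, if any.
print read_header_block( $ARGV[0] ) if @ARGV;
```

The returned string could then be handed to a header parser in place of the whole file, which should keep any address regex away from the encoded attachment data.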
I was looking through some small scripts I have here that look only for "From:", "To:", "Cc:" and "Bcc:". That led to using Email::Simple:
    #!/usr/bin/env perl
    #
    use strict;
    use warnings;
    use File::Find;
    use Email::Simple;
    use File::Slurp qw( read_file );

    my $directory = '/home/******/Mail/.family.directory/Browne, Bill & Martha';
    my $outfile   = 'output2.txt';
    my @found_files;

    # Collect every path (files and directories) under $directory.
    find( sub { push @found_files, $File::Find::name }, $directory );

    foreach (@found_files) {
        my $file = "$_";
        if ( -f $file ) {    # skip directories
            print $_, "\n";
            my $intext      = File::Slurp::read_file($file);
            my $mail        = Email::Simple->new($intext);
            my $from_header = $mail->header("From");
            my $to_header   = $mail->header("To");
            my $date_header = $mail->header("Date");
            my $cc_header   = $mail->header("CC");
            my $bcc_header  = $mail->header("BCC");

            # Starts with one empty string, so each message's output
            # begins with a newline once joined.
            my @emails = "";
            push @emails, ( $from_header, $to_header );
            if ( length $cc_header )  { push @emails, $cc_header; }
            if ( length $bcc_header ) { push @emails, $bcc_header; }

            # Append this message's headers to the output file.
            File::Slurp::write_file( $outfile, { append => 1 }, join( "\n", @emails ) );
        }
    }
This took about 2 seconds to process all 592 emails, and successfully wrote the names and email addresses to a file. Just a few observations:
My use of an array needs improving.
I'm unsure whether $mail->header("CC") will also read lines with "Cc" or "cc". The same goes for BCC.
Where a header holds a lot of addresses, I need to split it so that each address becomes a separate entry in the array. At present it is one large string of names and addresses, separated by commas. (I will need to be careful where a "," appears inside a name, though.) How do I do that?
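One possible approach to the comma question, as a sketch only: split on commas that fall outside double-quoted strings, so a display name such as "Browne, Bill" stays in one piece. `split_addresses` is a hypothetical helper written in core Perl, and the addresses are made-up examples. It assumes names containing commas are double-quoted, and it does not handle parenthesised comments or group syntax, which a full RFC 5322 parser would.

```perl
use strict;
use warnings;

# Hypothetical helper: split an address-list header value on commas
# that are outside double-quoted strings.
sub split_addresses {
    my ($header) = @_;
    return () unless defined $header && length $header;

    # One entry is a run of quoted strings and non-comma, non-quote text.
    my @parts = $header =~ /((?:"[^"\\]*(?:\\.[^"\\]*)*"|[^",]+)+)/gs;

    # Trim surrounding whitespace and drop entries that end up empty.
    s/^\s+|\s+\z//g for @parts;
    return grep { length } @parts;
}

# Example header value (invented for illustration).
my $to = '"Browne, Bill" <bill@example.com>, Martha Browne <martha@example.com>';
print "$_\n" for split_addresses($to);
```

In the script above, the result could be pushed straight onto the array, for example `push @emails, split_addresses($cc_header);`, so each address lands in its own element. Because the helper returns an empty list for an undefined header, the `length` checks around CC and BCC would no longer be needed for that call.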