rhxk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I wrote a little program that reads a standard Linux mbox, screen-scrapes a website to get the URLs associated with each email subject, and creates an RSS feed from them. What I want to do is remove/delete each email once I have read it, and so far I haven't found a way. The script is attached. I'm open to any kind of suggestions. Thanks.
#!/usr/bin/perl -w

$\ = "\n";
select(STDERR); $| = 1;
select(STDOUT); $| = 1;

use XML::RSS;
use WWW::Mechanize;
use Mail::Util qw(read_mbox);
use Mail::Internet;
use Mail::Header;

$mess_list = read_mbox('/var/spool/mail/robert') || die "Aieee\n";
$rssfile = '/var/www/html/rss.xml';
$rss = new XML::RSS (version => '2.0');
$m = WWW::Mechanize->new();
$url = "http://cnn.com/rss/index.html";
$x = '\$';

foreach my $message (@{$mess_list} ) {
    my $mail = Mail::Internet->new($message);
    my $head = $mail->head;    # A Mail::Header;
    $from = $head->get('From');
    $to = $head->get('To');
    $date = $head->get('Date');
    $subject = $head->get('Subject');

    # Build a shortened, escaped subject to match against the scraped page
    $subj = substr($subject,0,50);
    $subj =~ s/&/&amp;/g;
    $subj =~ s/"/&quot;/g;
    $subj =~ s/\\n/ /g;
    $subj =~ s/\$/$x/g;
    @habib = split(/\s+/,$subj);
    $subj = join(' ',@habib);
    chomp ($from);
    chomp ($subject);
    chomp ($date);
    chomp ($subj);
    chomp ($to);

    if ($to =~ /mysite.com/) {
        # Fetch the index page only once
        if (! defined($www)) {
            $www = 1;
            $m->get($url);
            $c = $m->content;
        }
        $link = $url;
        if (! -e $rssfile) {
            # No feed yet: create the channel first
            &new_file;
            foreach $line (split("\n",$c)) {
                if ($line =~ /<li> <a name=\"(\d\d\d\d\d\d\d?)\" href=\"msg(\d\d\d\d\d\d\d?).html\">(\s+)?$subj/) {
                    $link = "http://mysite.com/testrss/msg$2.html";
                    last;
                }
            }
            @body = &getacut($link,@{$mail->body});
            $rss->add_item(title => "$subject",
                           link => $link,
                           description => "<pre>@body</pre>",
                           mode => 'insert'
            );
            $rss->save($rssfile);
        } else {
            # Feed exists: load it and keep it capped at 50 items
            $rss->parsefile($rssfile);
            pop(@{$rss->{'items'}}) if (@{$rss->{'items'}} == 50);
            foreach $line (split("\n",$c)) {
                if ($line =~ /<li> <a name="(\d\d\d\d\d\d\d?)" href="msg(\d\d\d\d\d\d\d?).html">(\s+)?$subj/) {
                    $link = "http://mysite.com/testrss/msg$2.html";
                    last;
                }
            }
            # No matching link found: print the subject for debugging
            if ($link eq $url) {
                print $subject;
                print $subj;
                print '';
            }
            @body = &getacut($link,@{$mail->body});
            $rss->add_item(title => "$subject",
                           link => $link,
                           description => "<pre>@body</pre>",
                           mode => 'insert'
            );
            $rss->save($rssfile);
        }
#       print "Mail from : $from" .
#             "Mail to : $to" .
#             "Subject : $subject";
    }
#   print "Mail from : ",$head->get('From'),
#         "Mail to : ",$head->get('To'),
#         "Subject : ", $head->get('Subject'),"\n";
#   foreach my $body_line (@{$mail->body}) {
#       # do something with each line of the message
#       print $body_line;
#   }

    # Attempt to delete the processed message (this is the part I can't get to work)
    $mail->delete($message);
}
chmod(0644,$rssfile);
exit(0);

# Return the first lines of a message body, followed by a "Read More" link
sub getacut {
    my ($a,@a) = @_;
    undef(@b);
    my $count = 0;
    foreach my $line (@a) {
        $count++;
        push(@b,$line);
        if ($count > 10) {
            push (@b, "<br><a href=\"$a\">Read More...</a>");
            return(@b);
        }
    }
    return(@b);    # short bodies: return everything collected
}

# Set up the channel metadata for a brand-new feed
sub new_file {
    $rss->channel(title => 'Test Network',
                  link => 'http://www.mysite.com',
                  language => 'en',
                  description => 'Test Network',
                  rating => '(PICS-1.1 "http://www.classify.org/safesurf/" 1 r (SS~~000 1))',
                  copyright => 'GroonG',
                  pubDate => $date,
                  lastBuildDate => '',
                  docs => 'http://www.mySite.com',
                  managingEditor => 'Robert',
                  webMaster => 'rob@mysite.com'
    );
}

Replies are listed 'Best First'.
Re: reading mbox & deleting it
by bart (Canon) on May 10, 2007 at 06:15 UTC
    what I want to do is remove/delete the email I read..and so far I haven't found a way.
    Do you have access to the mailbox via POP3? In that case, using Net::POP3 makes it easy.
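
    A minimal sketch of the POP3 route, assuming the mailbox is reachable over POP3 at all. The helper name drain_mailbox, the handler callback, and the POP3_HOST / POP3_USER / POP3_PASS environment variables are made up for illustration; only new, login, list, get, delete and quit come from Net::POP3.

```perl
use strict;
use warnings;
use Net::POP3;

# Hand every message to $handler (as an array ref of lines) and mark it
# for deletion when the handler returns true.
# $pop is expected to be a logged-in Net::POP3 object.
sub drain_mailbox {
    my ($pop, $handler) = @_;
    my $sizes = $pop->list or return 0;          # hash ref: message number => size
    my $deleted = 0;
    for my $num (sort { $a <=> $b } keys %$sizes) {
        my $lines = $pop->get($num) or next;     # array ref of message lines
        if ($handler->($lines)) {
            $pop->delete($num);                  # only marks the message
            $deleted++;
        }
    }
    return $deleted;
}

# Hypothetical connection details, read from the environment.
if ($ENV{POP3_HOST}) {
    my $pop = Net::POP3->new($ENV{POP3_HOST}, Timeout => 30)
        or die "Cannot connect to $ENV{POP3_HOST}\n";
    defined($pop->login($ENV{POP3_USER}, $ENV{POP3_PASS}))
        or die "Login failed\n";

    # Placeholder handler: print the message, then ask for deletion.
    my $count = drain_mailbox($pop, sub { print @{ $_[0] }; return 1 });

    $pop->quit;    # marked messages are removed when the session ends cleanly
    print "Deleted $count message(s)\n";
}
```

    In the original script the handler would be the place to build the RSS item; returning false from it leaves the message on the server.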

    Otherwise, if you insist on doing this with file operations: IMO that's risky, because multiple processes may be accessing the mailbox at the same time (new mail arriving, for example). Be very careful to get the file locking right.
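
    To make the locking concern concrete, here is a rough sketch that rewrites an mbox file while holding an exclusive flock. It rests on a big assumption: that your mail delivery agent also uses flock on the mailbox. Many setups use fcntl locks or dotlock files instead, in which case this protects nothing, so check what your system does first. The function name remove_from_mbox and its callback are made up for illustration, and the split assumes body lines starting with "From " have been escaped, as a well-formed mbox should have them.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# Remove every message for which $should_delete->($message_text) is true.
# Returns the number of messages removed.
sub remove_from_mbox {
    my ($path, $should_delete) = @_;

    open(my $fh, '+<', $path) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $path: $!";

    # Read the mailbox only after the lock is held.
    my $content = do { local $/; <$fh> };
    $content = '' unless defined $content;

    # Each message starts with a "From " separator at the start of a line.
    my @messages = split /^(?=From )/m, $content;
    my @kept     = grep { !$should_delete->($_) } @messages;

    # Rewrite the file in place through the same locked handle.
    seek($fh, 0, SEEK_SET) or die "Cannot seek in $path: $!";
    truncate($fh, 0)       or die "Cannot truncate $path: $!";
    print {$fh} @kept;
    close($fh)             or die "Cannot close $path: $!";   # also releases the lock

    return @messages - @kept;
}
```

    Rewriting through the already-locked handle, rather than renaming a temporary file over the mailbox, keeps the lock attached to the file other processes are waiting on. A crash between truncate and close would still lose mail, so keep a backup copy while experimenting.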

      Maybe Mail::Box is good for you. It has a lot of features, one of the handy ones being that you can easily save into an extra 'deleted' folder the messages that you are deleting (just in case... ;) ).