peterr has asked for the wisdom of the Perl Monks concerning the following question:

I have been using the script below (from Mail::Box) to parse an mbox folder. It displays how many files are in the folder and, for each file, the contents of its attachments.

#!/usr/bin/perl
# Print the types of messages in the folder. Multi-part messages will
# be shown with all their parts.
#
# This code can be used and modified without restriction.
# Mark Overmeer, <*********@overmeer.net>, 9 nov 2001

use warnings;
use strict;
use lib '..', '.';

use Mail::Box::Manager 2.00;
use Mail::Box::Dir;
use File::Slurp qw( write_file );

my $outfile = 'output5.txt';

sub show_type($;$);

#
# Get the command line arguments.
#

die "Usage: $0 folderfile\n" unless @ARGV==1;
my $filename = shift @ARGV;

#
# Open the folder
#

my $mgr    = Mail::Box::Manager->new;
my $folder = $mgr->open
  ( $filename
  , extract => 'LAZY'   # never take the body unless needed
  );                    # which saves memory and time.

die "Cannot open $filename: $!\n" unless defined $folder;

#
# List all messages in this folder.
#

my @messages = $folder->messages;
print "Mail folder $filename contains ", scalar @messages, " messages:\n";

File::Slurp::write_file( $outfile, {append => 1 }, $folder->readMessageFilenames($filename) );

my $counter = 1;
foreach my $message (@messages)
{   printf "%3d. ", $counter++;
    print $message->get('Subject') || '<no subject>', "\n";
    show_type $message;
}

sub show_type($;$)
{   my $msg    = shift;
    my $indent = (shift || '') . ' ';  # increase indentation
    print $indent, " type=", $msg->get('Content-Type'), ', '
      , $msg->size, " bytes\n";

    if($msg->isMultipart)
    {   foreach my $part ($msg->parts)
        {   show_type $part, $indent;
        }
    }
}

#
# Finish
#

$folder->close;

The number of files is not correct, so I have modified the script by adding the line

File::Slurp::write_file( $outfile, {append => 1 }, $folder->readMessageFilenames($filename) );

I need to display the actual filename for each file found, but nothing is being written to $outfile. The documentation for "readMessageFilenames" is at Mail::Box::Dir. How can I get the filenames to display, please?

Replies are listed 'Best First'.
Re: Display filenames in mbox folder
by Anonymous Monk on Mar 01, 2015 at 00:56 UTC
    The documentation for "readMessageFilenames" is at Mail::Box::Dir

    The source of readMessageFilenames in Mail::Box::Dir is "sub readMessageFilenames() {shift->notImplemented}", so $folder is likely one of its subclasses, Mail::Box::MH or Mail::Box::Maildir. Can you tell us which one? (Try print ref $folder;)

    More information would be useful. What is $filename - a single file, or a directory? What does "The number of files is not correct" mean? Most importantly, what files are in the folder you are looking at; i.e. what is your expected output? Also, Basic debugging checklist and How do I post a question effectively?

      The source of readMessageFilenames in Mail::Box::Dir is "sub readMessageFilenames() {shift->notImplemented}"

      I just downloaded version 2.118 of Mail::Box and yes, "readMessageFilenames" is unimplemented.

      so $folder is likely one of its subclasses, Mail::Box::MH or Mail::Box::Maildir

      Yes, in the same version 2.118 I can see that Mail::Box::MH and Mail::Box::Maildir do implement it.

      can you tell us which one? (try print ref $folder;)

      The print displays Mail::Box::Maildir

      More information would be useful. What is $filename - a single file, or a directory? What does "The number of files is not correct" mean? Most importantly, what files are in the folder you are looking at; i.e. what is your expected output? Also, Basic debugging checklist

      $filename is a directory. The number of files should be 592. The script posted only parses 591 files and displays 591. Other small Perl scripts I have run report 592 files. I need to determine which file is being missed, and then determine the reason why.

      I'm using the Perl debugger quite a lot with this. Thanks. :)

        readMessageFilenames in Mail::Box::Maildir does exclude some files. Look for the file whose name does not match the pattern /^([0-9][\w.:,=\-]+)$/. It also does some trickery with the numbers that the filenames are supposed to start with, so if you've got two files that are named something like "0012345abc" and "012345abc", one of those is going to get ignored.
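
To hunt for the missing file without going through Mail::Box at all, something along these lines might help. It is only a rough sketch of the two checks described above, not the module's actual code: the helper name classify_names is made up for illustration, and the padding width of 10 digits is an assumption about how the leading numbers are normalised, so adjust it if your version of Mail::Box::Maildir differs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern quoted in the reply above: a name must start with a digit and
# may only contain word characters and . : , = -
my $NAME_RE = qr/^([0-9][\w.:,=\-]+)$/;

# ASSUMPTION: leading digits are left-padded with zeros to this width
# before names are compared. The value 10 is a guess for illustration.
my $PAD_WIDTH = 10;

# Hypothetical helper: returns two array refs. The first holds names the
# pattern rejects, the second holds [name, earlier_name] pairs whose
# zero-padded forms are identical.
sub classify_names {
    my @names = @_;
    my (@rejected, @shadowed, %seen);
    for my $name (sort @names) {
        if ($name !~ $NAME_RE) {
            push @rejected, $name;
            next;
        }
        my ($digits) = $name =~ /^(\d+)/;
        my $pad = $PAD_WIDTH - length $digits;
        my $key = ('0' x ($pad > 0 ? $pad : 0)) . $name;
        if (exists $seen{$key}) {
            push @shadowed, [ $name, $seen{$key} ];
        }
        else {
            $seen{$key} = $name;
        }
    }
    return (\@rejected, \@shadowed);
}

# Scan every directory given on the command line and report suspects.
for my $dir (@ARGV) {
    opendir my $dh, $dir or die "Cannot open $dir: $!\n";
    my @names = grep { -f "$dir/$_" } readdir $dh;
    closedir $dh;

    my ($rejected, $shadowed) = classify_names(@names);
    print "$dir: rejected by pattern: $_\n" for @$rejected;
    print "$dir: $_->[0] collides with $_->[1] after padding\n"
        for @$shadowed;
}
```

Run it against the directories that actually hold the message files. In a Maildir layout those are normally the cur and new subdirectories rather than the top-level folder directory. Any name it prints is a candidate for the 592nd file.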