http://qs1969.pair.com?node_id=11139826

jkeenan1 has asked for the wisdom of the Perl Monks concerning the following question:

What I Want to Do: Be able to study the time distribution of the arrival in my Inbox of newsletters to which I subscribe.

Current Situation: I subscribe to 3 newsletters about which I receive a summary email 6 or 7 days per week. For years these summary emails would arrive at predictable times: 1:00 am local time (America/New York) for newsletters originating in Australia or the U.K.; 8:00 am local time for a newsletter originating in the U.S.A.

In the last 6 months the arrival times of these summary emails (which I believe use a common technology) have become very erratic -- as much as 16 to 24 hours later than normal. Since I am an active commenter on these newsletters, this is annoying.

One of the 3 newsletters has already begun moving to a different distribution technology. I would like to encourage the other 2 newsletters to do likewise. To that end, I would like to provide them with data about the actual arrival times in my Inbox of their summary emails.

I use Thunderbird as my email client to Pobox's IMAP server. I need to get the data out of Thunderbird and into a plain-text format (probably CSV) so that I can write Perl code to analyze it. (In a later stage, I will need to locate CPAN modules that will enable me to, say, create a scatter-point diagram with which to display the message arrival time data.)

What I Already Know How to Do:

Has anyone faced this problem previously who can point me toward a way to get the data out of Thunderbird and into a plain-text file?

Jim Keenan

Re: Get saved search data out of Thunderbird and into plain-text file
by Corion (Patriarch) on Dec 22, 2021 at 15:09 UTC

    I'm not sure if Thunderbird stores the local copies as SQLite databases or the weirdo Mozilla internal format (Mork, more comments by jwz). According to some random comments on the internet, global-messages-db.sqlite in your profile directory is the index into all the (local) messages, so you might get by with running the SQL against that file (DBD::SQLite and/or my DBIx::RunSQL).

    If you are lucky, getting the results is merely running the SQL query against the SQLite database in the INBOX file (which might be an SQLite database).

    If you are unlucky, you need to read/import the Mork format into SQLite to then run the SQL query against that.

    If you are adventurous, you can try to create a converter from SQL to IMAP queries to run the search query directly on the remote IMAP server, but I think that's the approach least likely to be successful.

    Update:

    Having looked at the .msf files, they seem to contain maybe IMAP queries, but certainly not SQL.

    On the other hand, the global-messages-db.sqlite file in your profile contains basically a copy of all mails as a cache for Thunderbird.

    I would look at querying that SQLite file. You might want to find out how to make SQLite ignore locks on the database so that you can read it while Thunderbird is still running.
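A minimal sketch of that read-only approach, assuming DBD::SQLite is installed. It deliberately assumes nothing about the schema: it only lists the tables and columns it finds. The function name `sqlite_schema` is illustrative, and working on a copy of the file is one way (an assumption here, not something confirmed above) to sidestep the locking question while Thunderbird is running.

```perl
use strict;
use warnings;
use DBI;
use DBD::SQLite::Constants qw(:file_open);
use File::Copy qw(copy);
use File::Temp qw(tempdir);

# Return { table_name => [column names] } for an SQLite file.
# The file is copied first so that locks held by a running
# Thunderbird do not get in the way.
sub sqlite_schema {
    my ($db_file) = @_;
    my $tmpdir = tempdir(CLEANUP => 1);
    my $copy   = "$tmpdir/copy.sqlite";
    copy($db_file, $copy) or die "Cannot copy $db_file: $!";

    my $dbh = DBI->connect("dbi:SQLite:dbname=$copy", '', '', {
        RaiseError        => 1,
        sqlite_open_flags => SQLITE_OPEN_READONLY,
    });

    my %schema;
    my $tables = $dbh->selectcol_arrayref(
        q{SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name});
    for my $table (@$tables) {
        # PRAGMA table_info returns one row per column; field 1 is the name.
        # Virtual tables that depend on application-specific extensions may
        # not be readable outside the application, so failures are skipped.
        my $cols = eval {
            $dbh->selectall_arrayref(
                'PRAGMA table_info(' . $dbh->quote_identifier($table) . ')');
        } || [];
        $schema{$table} = [ map { $_->[1] } @$cols ];
    }
    $dbh->disconnect;
    return \%schema;
}

# Usage: perl schema.pl /path/to/profile/global-messages-db.sqlite
if (defined(my $db = shift @ARGV)) {
    my $schema = sqlite_schema($db);
    print "$_: @{ $schema->{$_} }\n" for sort keys %$schema;
}
```

Once the table and column names are known, the actual query for subjects and dates can be written against them.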

      Thanks, Fletch and Corion for your suggestions. (I haven't gotten to the other posts yet, as I just noticed them.)

      If this were a commercial project, I would definitely explore Mail::Box. However, as this is just a one-off for a friend, I was hoping to avoid having to learn that module or, for that matter, SQLite.

      By poking around in ~/.thunderbird, I've learned that Thunderbird stores most of its own data in SQLite3, but some of it in the Mork format which Corion mentioned. Again, more than I really want to learn. Fortunately, the contents of any given subfolder are stored in a plain-text file whose path looks something like:

      ~/.thunderbird/NNNNNsNn.default/ImapMail/mail.pobox.com/INBOX.sbd/subfolder_name

      ... which is what I'm now playing around with.

      Jim Keenan

        My guess (I don't use Thunderbird) would be that that's the mbox file which Mail::Box would read, but if you can kludge out what you're interested in, that's probably enough for what you're trying to do.

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.
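A minimal sketch of the "kludge it out of the plain-text file" route, using only core modules. It assumes the folder file really is in mbox format (messages separated by lines starting with "From ", which is Fletch's guess rather than something confirmed here) and that Date headers carry a numeric offset such as -0500. The names `date_to_epoch`, `mbox_rows` and `csv_field` are illustrative.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Convert a date like "Wed, 22 Dec 2021 01:00:05 -0500" to epoch seconds.
# Returns nothing if the string is not in that numeric-offset form.
sub date_to_epoch {
    my ($date) = @_;
    return unless defined $date;
    my ($d, $mon, $y, $h, $mi, $s, $sign, $oh, $om) = $date =~
        /(\d{1,2})\s+([A-Za-z]{3})\s+(\d{4})\s+(\d{2}):(\d{2})(?::(\d{2}))?\s+([+-])(\d{2})(\d{2})/
        or return;
    $mon = ucfirst lc $mon;
    return unless exists $MONTH{$mon};
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    my $utc = eval { timegm($s // 0, $mi, $h, $d, $MONTH{$mon}, $y) };
    return unless defined $utc;
    return $utc - $offset;   # local wall-clock time minus offset gives UTC
}

# Return one [subject, date header, epoch] row per message in an mbox file.
# Only the header block of each message is inspected.
sub mbox_rows {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my (@rows, %hdr, $last);
    my $in_headers = 0;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        if ($line =~ /^From /) {              # mbox message separator
            %hdr = ();
            $last = undef;
            $in_headers = 1;
            next;
        }
        next unless $in_headers;
        if ($line eq '') {                    # blank line ends the headers
            push @rows, [ $hdr{subject} // '', $hdr{date} // '',
                          scalar date_to_epoch($hdr{date}) ];
            $in_headers = 0;
        }
        elsif ($line =~ /^\s/) {              # folded continuation line
            next unless defined $last;
            $line =~ s/^\s+/ /;
            $hdr{$last} .= $line;
        }
        elsif ($line =~ /^([^:\s]+):\s*(.*)/) {
            my $name = lc $1;
            if (exists $hdr{$name}) { $last = undef }   # keep first occurrence
            else { $hdr{$name} = $2; $last = $name }
        }
    }
    close $fh;
    return @rows;
}

# Quote one CSV field, doubling any embedded double quotes.
sub csv_field {
    my $f = shift // '';
    $f =~ s/"/""/g;
    return qq{"$f"};
}

# Usage: perl mbox2csv.pl /path/to/subfolder_name > arrivals.csv
if (defined(my $mbox = shift @ARGV)) {
    print join(',', map { csv_field($_) } @$_), "\n" for mbox_rows($mbox);
}
```

Note that the Date header records when the sender claims to have sent the message, not when it arrived, so for delay analysis the Received headers are likely the better source.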

Re: Get saved search data out of Thunderbird and into plain-text file
by Fletch (Bishop) on Dec 22, 2021 at 14:31 UTC

    I'm not 100% sure but a quick ddg'ing seems to say that Thunderbird uses either maildir or mbox files under the hood, so you possibly could use something like Mail::Box to access things (haven't used it in aeons but I believe it should handle either of those formats). Read whatever you're interested in out into (say) a sqlite3 DB and then do whatever.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: Get saved search data out of Thunderbird and into plain-text file
by NERDVANA (Deacon) on Dec 22, 2021 at 21:27 UTC
    Another option (though maybe more effort) is to use Net::IMAP::Simple to connect to your imap server and list out the relevant messages, assuming all you needed was subject line and timestamp.

    use DateTime::Format::Mail;
    use MIME::Head;
    use Net::IMAP::Simple;

    my $email_date = DateTime::Format::Mail->new->loose;
    my $imap = Net::IMAP::Simple->new($host, ...) or die;
    $imap->select($mailbox) or die;
    my @message_ids = $imap->search("(SUBJECT foo)") or die;
    for my $msg_id (@message_ids) {
        # top() returns the header lines of the message
        my $header_lines = $imap->top($msg_id) or die;
        my $head = MIME::Head->from_file(\join('', @$header_lines));
        my $date = $email_date->parse_datetime($head->get("Date"));
        my $subject = $head->get("Subject");
        printf "%s %s : %s\n", $date->ymd, $date->hms, $subject;
    }

    There's a bit more work to do if you have UTF-7 subject lines or need to unpack the date timestamp from the Received headers rather than the sender's date header.
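A minimal sketch of the Received-header part, using only core Perl. It relies on two conventions: each relay prepends its own Received header, so the topmost one is the most recent hop, and the timestamp follows the last semicolon of that header. The function name `arrival_stamp` and the sample headers are illustrative, and whether the topmost hop is the one you care about depends on your mail provider.

```perl
use strict;
use warnings;

# Given the raw text of one message (or just its headers), return the
# timestamp carried by the topmost Received header, i.e. the text after
# its last ';'. Returns nothing if no such timestamp is found.
sub arrival_stamp {
    my ($raw) = @_;
    my ($head) = split /\r?\n\r?\n/, $raw, 2;    # keep only the header block
    return unless defined $head;
    $head =~ s/\r?\n[ \t]+/ /g;                  # join folded lines
    for my $line (split /\r?\n/, $head) {
        next unless $line =~ /^Received:/i;
        return unless $line =~ /;\s*([^;]+?)\s*\z/;
        (my $stamp = $1) =~ s/\s*\([^)]*\)\s*\z//;   # drop a trailing "(EST)"
        return $stamp;
    }
    return;
}

# Made-up sample headers for illustration only.
my $sample = join "\n",
    'Received: from relay.example by mx.example with ESMTP id abc123;',
    "\tWed, 22 Dec 2021 17:03:11 -0500 (EST)",
    'Received: from origin.example by relay.example; Wed, 22 Dec 2021 06:00:05 +0000',
    'Date: Wed, 22 Dec 2021 06:00:01 +0000',
    'Subject: Daily summary',
    '',
    'body';
print arrival_stamp($sample), "\n";
```

The returned string can then be handed to the same date parser used for the Date header, and the difference between the two gives the delivery delay.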

Re: Get saved search data out of Thunderbird and into plain-text file -- IMAP monitoring
by Discipulus (Canon) on Dec 23, 2021 at 10:38 UTC
    Hello

    I too would go for something similar to NERDVANA's ++solution: if the server speaks IMAP, it is easier to work with the server than with the client.

    You can easily adapt the following code, which uses Mail::IMAPClient

    use strict;
    use warnings;
    use Mail::IMAPClient;
    use Term::ReadKey;
    use Getopt::Long;

    my $VERSION = 3;

    my $user;
    my $server;
    my $port;
    my $ssl;
    my $imap_folder = "INBOX";
    my $sleep = 60;

    GetOptions (
        "u|user=s"     => \$user,
        "s|server=s"   => \$server,
        "p|port=i"     => \$port,
        "ssl=i"        => \$ssl,
        "f|folder=s"   => \$imap_folder,
        "i|interval=i" => \$sleep,
    ) or show_help();

    sub show_help{
        print <<EOH;
    $0 usage:

    $0 -user LOGIN -server SERVER -port N [ -ssl 0|1 -folder FOLDER -interval SECONDS ]

        -u -user      username to authenticate to the specific server
        -s -server    IMAP
        -p -port
        -ssl          (1 or 0)
        -f -folder    IMAP to monitor ( default: "INBOX" )
        -i -interval  aka sleep time between checks (default 60)
    EOH
        exit;
    }

    # all four of these must be supplied on the command line
    my %check = (user=>$user,server=>$server,port=>$port,ssl=>$ssl);
    foreach my $need ( keys %check ){
        unless ( defined $check{$need} ){
            warn "\nplease specify the parameter: $need";
            show_help();
        }
    }

    print "\n(terminate the program using CTRL-C to permit a clean IMAP logout)\n";
    print "put the pwd for $user at $server\n";
    my $password;
    ReadMode('noecho');
    $password = ReadLine(0);
    chomp $password;
    ReadMode 'normal';

    my $imap = Mail::IMAPClient->new(
        Server   => $server,
        User     => $user,
        Password => $password,
        Port     => $port,
        Ssl      => $ssl,
        Uid      => 1,
    ) or die "IMAP Failure: $@";

    print "\n$user correctly authenticated on $server port $port\n".
          "check for new messages in $imap_folder every $sleep seconds\n";

    # Handle Ctrl-C
    $SIG{INT} = sub{
        print "\n\nlogout..\n";
        $imap->logout();
        print "exiting..\n";
        exit;
    };

    # Peek(1): do not mark messages as read while inspecting them
    $imap->Peek(1);
    $imap->select( $imap_folder ) or die "IMAP Select Error for imap folder [$imap_folder]: $@";

    my $now = time;
    my %seen = map { $_ => 1} $imap->sentsince($now);

    while (1){
        my @msgs = $imap->sentsince($now);
        foreach my $msg (@msgs){
            next if $seen{$msg};
            my $from = $imap->get_header( $msg, "From" );
            my $subj = $imap->get_header( $msg, "Subject" );
            print "New message:\n".
                  "\tFROM: $from\n".
                  "\tSUBJ: $subj\n";
            $seen{ $msg }++;
            #####################################
            # CHANGE HERE AS NEEDED
            #####################################
            # system(1, ...) is the Windows form that does not wait for the child
            system(1, 'echo', "new msg from $from subj $subj");
        }
        sleep $sleep;
        $now = time;
    }

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

Re: Get saved search data out of Thunderbird and into plain-text file
by bliako (Monsignor) on Dec 23, 2021 at 08:13 UTC

    Here are a few workarounds:

    • Change your mailbox settings (for that specific account) so that all your emails are also copied to a local folder as they arrive (check 'Copies & Folders'). If you use one of those email providers which regularly lock their users out of their accounts because they refuse to give a telephone number (or their date of birth) or have "detected unusual activity" (typical harassment routine for us "outlier" users), then you have probably done this already. Note that ticking that option entails downloading thousands of emails to your local disk, which is not ideal if you are on a slow connection or have some kind of quota. Once you do it, you will be able to see your emails in the folder you mentioned, in a single file. It will probably be the biggest file there.
    • In some Thunderbirds, there is a 'Quick Filter' button for each mailbox (this is on the left of 'Search' but it is not the 'Search'). Use that to filter your desired emails. In the Display you will now have only the emails you filtered. Select them all (ctrl-a), right-click and 'Save as' into a new temporary folder. Each email will be saved into that folder in .eml format for further processing using Perl.
    • Create a MailExtension plugin. It's JavaScript. Plugins are distributed as xpi files, which are just zip archives. You can inspect a simple one here, or use this tutorial.
    • There are a couple of Thunderbird extensions for creating mail statistics. See this or this.

    So, plenty of options to keep you in front of your computer during the festive season (wasn't that what you were looking for really? hehe)

    bw, bliako

Re: Get saved search data out of Thunderbird and into plain-text file
by perlfan (Vicar) on Dec 28, 2021 at 22:28 UTC
    It seems that you can selectively export emails, based on https://www.ionos.com/help/email/other-email-programs/mozilla-thunderbird-exporting-emails/. I seem to recall I've done this before, but for moving to another laptop. It says the export is in EML format, which, according to the "specification" described in the EML link, contains date/time info, which is what you need. For post-processing, I found the reverse of what you might need, but it might offer some hints: Mail::Convert::Mbox::ToEml. It's by RJBS, so he may also be a good contact.
    Standard Method:
    • Launch Thunderbird.
    • Select your Inbox or another folder.
    • Select the email you want to export. Or press CTRL+A to select all emails.
    • Click the menu button to display the Thunderbird menu.
    • Select Save as > File.
    • Select the folder where the emails should be saved and click Save.
    hth
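For the exported .eml route, a minimal core-only sketch that pulls one header out of each file in a directory and prints tab-separated output. It assumes each .eml file holds a single message whose headers end at the first blank line; the function name `eml_header` and the choice of the Date header are illustrative.

```perl
use strict;
use warnings;

# Return the value of the first header called $name in an .eml file,
# or undef if the header is absent.
sub eml_header {
    my ($path, $name) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $raw = do { local $/; <$fh> } // '';     # slurp the whole file
    close $fh;
    $raw =~ s/\r\n/\n/g;                        # normalise CRLF line endings
    my ($head) = split /\n\n/, $raw, 2;         # headers end at first blank line
    return undef unless defined $head;
    $head =~ s/\n[ \t]+/ /g;                    # unfold continuation lines
    return $head =~ /^\Q$name\E:[ \t]*(.*)$/mi ? $1 : undef;
}

# Usage: perl eml_dates.pl /path/to/exported/folder > dates.tsv
if (defined(my $dir = shift @ARGV)) {
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    for my $file (sort grep { /\.eml\z/i } readdir $dh) {
        my $date = eml_header("$dir/$file", 'Date') // '';
        print "$file\t$date\n";
    }
    closedir $dh;
}
```

Passing 'Received' instead of 'Date' returns the topmost Received header, which is closer to the actual arrival time than the sender's Date header.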