in reply to Re^3: Mail::MboxParser pegs the CPU
in thread Mail::MboxParser pegs the CPU

> You can use something like this to create index:

I'm sorry to disagree, but - no, you can't. Per the docs:

    enable_cache
        When set to a true value, caching is used but only if you gave *cache_file_name*. There is no default value here!

    cache_file_name
        The file used for caching. This option is mandatory if *enable_cache* is true.

Neither of these is set in your code - and setting them does not create the specified file.

By the way, I've tried calling '$mb->make_index' explicitly in my code, where both of the above options are set (please see the code I originally posted). The cache file still does not get created (and yes, I do have write permissions in that directory; an 'open' call in the script creates a file there without any problem). At this point, it's beginning to look like a module bug.


--
"Language shapes the way we think, and determines what we can think about."
-- B. L. Whorf

Re^5: Mail::MboxParser pegs the CPU
by zwon (Abbot) on Jan 15, 2010 at 20:39 UTC
    > I'm sorry to disagree, but - no, you can't.

    Have you tried?

    $ perl mailbox_parser.pl /home/zwon/Mail/FreeBSD-security | head -n 20
    00000 => 0000000000 => FreeBSD Security Advisory FreeBSD-SA-04:03.jail
    00001 => 0000007411 => [FreeBSD-Announce] FreeBSD Security Advisory FreeBSD-SA-04:04.tcp
    00002 => 0000015717 => FreeBSD Security Advisory FreeBSD-SA-04:04.tcp
    00003 => 0000024280 => [FreeBSD-Announce] FreeBSD Security Advisory FreeBSD-SA-04:05.openssl
    00004 => 0000032992 => FreeBSD Security Advisory FreeBSD-SA-04:05.openssl
    00005 => 0000041959 => FreeBSD Security Advisory FreeBSD-SA-04:06.ipv6
    00006 => 0000049688 => FreeBSD Security Advisory FreeBSD-SA-04:07.cvs
    00007 => 0000058113 => FreeBSD Security Advisory FreeBSD-SA-04:08.heimdal
    00008 => 0000069503 => FreeBSD Security Advisory FreeBSD-SA-04:09.kadmind
    00009 => 0000077885 => FreeBSD Security Advisory FreeBSD-SA-04:10.cvs
    00010 => 0000086754 => FreeBSD Security Advisory FreeBSD-SA-04:11.msync
    00011 => 0000094967 => FreeBSD Security Advisory FreeBSD-SA-04:12.jailroute
    00012 => 0000102596 => FreeBSD Security Advisory FreeBSD-SA-04:13.linux
    00013 => 0000112415 => FreeBSD Security Advisory FreeBSD-SA-04:14.cvs
    00014 => 0000123949 => FreeBSD Security Advisory FreeBSD-SA-04:15.syscons
    00015 => 0000131204 => FreeBSD Security Advisory FreeBSD-SA-04:16.fetch
    00016 => 0000141362 => FreeBSD Security Advisory FreeBSD-SA-04:17.procfs
    00017 => 0000151053 => FreeBSD Security Advisory FreeBSD-SA-05:01.telnet
    00018 => 0000160935 => FreeBSD Security Advisory FreeBSD-SA-05:02.sendfile
    00019 => 0000170195 => FreeBSD Security Advisory FreeBSD-SA-05:03.amd64

    I just replaced 'mbox' with $ARGV[0] in my example.

    PS: Note that the cache is not the same as the index.

    PPS: You need Mail::Mbox::MessageParser::Cache installed for the cache to be created; I don't know why this isn't mentioned in the Mail::MboxParser documentation.

      > Have you tried?

      I have. I explicitly said so in the post to which you just replied.

      I can get a printout similar to the above, but as I understand it, it's just a demonstrator tool that lets you "peek at the index" - that's what the docs say. I'm not clear on how that's supposed to help, or speed up the retrieval - which is what I'm trying to do.

      If you know how I can use this to speed up the retrieval, please enlighten me.

      P.S. I have both Mail::Mbox::MessageParser::Cache and Mail::Mbox::MessageParser::Grep installed - they were installed even before I posted my question here.


        > If you know how I can use this to speed up the retrieval

        First you have to build the index and save it to a file:

        use strict;
        use warnings;
        use Mail::MboxParser;
        use Storable;

        my $mbox = shift;    # name of the mailbox file
        my $mb   = Mail::MboxParser->new($mbox);
        $mb->make_index;

        my @index;

        # Build index. I'm adding message position and subject
        # into index, but you can add also other fields.
        for ( 0 .. $mb->nmsgs - 1 ) {
            push @index,
              [ $_, $mb->get_pos($_), $mb->get_message($_)->header->{subject} ];
        }

        # save index into file
        store \@index, "$mbox.idx";

        Then you can retrieve the index from the file, get the position of the message you want, and read that message directly. The following example takes the mbox file name as its first argument and the message number as its second.

        use strict;
        use warnings;
        use Mail::MboxParser;
        use Storable;

        my $mbox = shift;
        my $num  = shift;

        # loading index
        my $index = retrieve("$mbox.idx");

        unless (defined $num) {
            # print index
            print join(" => ", @$_), "\n" for @$index;
        }
        else {
            # print message $num
            die "No such message" if $num >= @$index;
            my $mb = Mail::MboxParser->new($mbox);
            $mb->set_pos($index->[$num][1]);
            print $mb->next_message_old;
        }
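To show why a saved position index speeds up retrieval, here is a rough sketch of the same idea using only core Perl (Storable, seek, read) and no Mail::MboxParser at all. It is not how the module works internally; the sub names build_offset_index and read_message and the ".offsets" file suffix are made up for illustration. It also assumes a well-formed mbox in which every message starts with a "From " line preceded by a blank line (or the start of the file), and any body line starting with "From " has been escaped.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Scan the mbox once and record the byte offset of each "From " separator.
# Assumption: a separator is a line starting with "From " that follows a
# blank line or the start of the file.
sub build_offset_index {
    my ($mbox) = @_;
    open my $fh, '<', $mbox or die "Cannot open $mbox: $!";
    binmode $fh;
    my @offsets;
    my $pos        = 0;
    my $prev_blank = 1;    # start of file counts as "after a blank line"
    while ( my $line = <$fh> ) {
        push @offsets, $pos if $prev_blank && $line =~ /^From /;
        $prev_blank = ( $line =~ /^\r?\n?$/ ) ? 1 : 0;
        $pos = tell $fh;
    }
    close $fh;
    return \@offsets;
}

# Fetch message $num by seeking straight to its offset and reading up to
# the next message's offset (or end of file for the last message).
sub read_message {
    my ( $mbox, $offsets, $num ) = @_;
    die "No such message\n" if $num < 0 || $num >= @$offsets;
    open my $fh, '<', $mbox or die "Cannot open $mbox: $!";
    binmode $fh;
    my $start = $offsets->[$num];
    my $end   = $num < $#$offsets ? $offsets->[ $num + 1 ] : ( -s $fh );
    seek $fh, $start, 0 or die "Cannot seek in $mbox: $!";
    my $msg = '';
    defined( read $fh, $msg, $end - $start ) or die "Cannot read $mbox: $!";
    close $fh;
    return $msg;
}

# Command-line use: script.pl MBOX [MESSAGE_NUMBER]
if (@ARGV) {
    my ( $mbox, $num ) = @ARGV;
    my $idx_file = "$mbox.offsets";    # hypothetical index file name

    # Rebuild the index if it is missing or older than the mailbox.
    my $fresh = -e $idx_file
      && ( stat $idx_file )[9] >= ( stat $mbox )[9];
    store( build_offset_index($mbox), $idx_file ) unless $fresh;

    my $offsets = retrieve($idx_file);
    if ( defined $num ) {
        print read_message( $mbox, $offsets, $num );
    }
    else {
        printf "%05d => %010d\n", $_, $offsets->[$_] for 0 .. $#$offsets;
    }
}
```

The full scan happens only when the index is missing or stale; every later lookup is a single seek plus one read, regardless of the mailbox size. Any index saved this way goes out of date as soon as the mailbox changes, which is why the sketch compares modification times before trusting it.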