in reply to Re: Remove Duplicates from a mbox file
in thread Remove Duplicates from a mbox file

I had, but it seemed like a little bit of overkill for what I was doing. And I got to learn a little more Perl doing it.

--
negativespace.net - all things inbetween.

Re^3: Remove Duplicates from a mbox file
by Anonymous Monk on Oct 11, 2007 at 03:20 UTC
    I couldn't get the Perl code above to work right, so I kept searching and found the script at the web site below. It seems to work great! It removed 2400 duplicates from a 200MB mbox file. It also automatically creates a backup for you. www.wdr1.com/hacks/mbox-dedup.pl
      Yes, but beware: it skips messages that do not have a Message-ID header, and they are not stored in the resulting file, so you'll have to keep the backup file anyway. However, all skipped messages are output.
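
For comparison, here is a minimal sketch of a dedup pass that keeps messages lacking a Message-ID instead of dropping them. It is not the code from mbox-dedup.pl; the sub names (split_mbox, message_id, dedup) are made up for illustration, and it assumes the whole mbox fits in memory, uses Unix line endings, and has body lines beginning with "From " already escaped (">From "), as most mbox writers do.

```perl
use strict;
use warnings;

# Split mbox text into messages. Assumes every message starts at a
# line beginning with "From " and that such lines inside bodies are escaped.
sub split_mbox {
    my ($text) = @_;
    return split /^(?=From )/m, $text;
}

# Return the Message-ID value from the header block, or undef if absent.
# A folded or empty Message-ID header is treated as absent, which errs
# on the side of keeping the message.
sub message_id {
    my ($msg) = @_;
    my ($headers) = split /\n\n/, $msg, 2;
    return $headers =~ /^Message-ID:[ \t]*(\S+)/mi ? $1 : undef;
}

# Keep the first copy of each Message-ID; always keep messages without one.
sub dedup {
    my (@messages) = @_;
    my (%seen, @kept);
    for my $msg (@messages) {
        my $id = message_id($msg);
        next if defined $id && $seen{$id}++;   # duplicate: drop it
        push @kept, $msg;
    }
    return @kept;
}
```

To use it, slurp the mbox into a scalar and print the result of dedup(split_mbox($text)) to a new file, leaving the original untouched as the backup. Two different messages without a Message-ID are both kept, so true duplicates among those will survive; comparing a digest of the body would be one way to catch them.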