
Well, I have written up as much as I could. It is hard to describe, but anyone who has tried out any of Zoe, Six Degrees, or Emila will know instantly what I am talking about. They are all free, so those who are interested can download and try them out. I have provided links above. Nonetheless, I will try to describe it here.

Most apps that search email search for matches against user-provided text strings, or some email headers or some combo thereof. The search is initiated by the user. There is nothing serendipitous.

The above apps are different. Let me try to describe them. You click on a date in a calendar, and all the emails for that date show up. You click on an email, and the text of that email shows up with all the HTML links, all the email lists, attachments, and people mentioned in that email neatly categorized on the side. Click on any one of them and related info shows up. It is like wandering through a maze... every turn brings up a new view.
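To make the "categorized on the side" part concrete, here is a minimal Python sketch of pulling those categories out of one raw message. The function name `extract_facets`, the dictionary keys, and the URL pattern are all made up for illustration; this is not how Zoe or the other apps do it, and the URL matching in particular is deliberately crude.

```python
import re
from email import message_from_string
from email.utils import getaddresses

# Crude URL pattern, good enough for a sketch; real mail needs more care.
URL_RE = re.compile(r"https?://[^\s<>\"')]+")


def extract_facets(raw):
    """Return the cross-linkable pieces of one raw RFC 2822 message."""
    msg = message_from_string(raw)

    # People: every address found in the From, To and Cc headers.
    headers = (msg.get_all("From", []) + msg.get_all("To", [])
               + msg.get_all("Cc", []))
    people = sorted({addr.lower() for _, addr in getaddresses(headers) if addr})

    links, attachments = set(), []
    for part in msg.walk():
        if part.is_multipart():
            continue
        filename = part.get_filename()
        if filename:
            # Any part carrying a filename is treated as an attachment.
            attachments.append(filename)
        elif part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            links.update(URL_RE.findall(payload.decode(charset, errors="replace")))

    return {
        "date": msg.get("Date"),
        "people": people,
        "lists": msg.get_all("List-Id", []),
        "links": sorted(links),
        "attachments": attachments,
    }
```

Each value in the returned dictionary is something a viewer could turn into a clickable entry next to the message text.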

Here is one analogy... Netflix vs. Blockbuster. Without starting a religious war here, this is why I like BB over Netflix. With Netflix I have to know what I am looking for. I can search on genres and keywords and the app responds with films that match that. With BB I don't have to know anything. I simply wander the aisles and come across new films. The only controlling factor is the BB employee who placed them in the particular aisles. It's like comparing an online catalog with a library. They serve different functions. One can simply wander in a library and be delighted by different things that turn up with every turn one takes.

Re: Re: Re: A Perl-app for twingling
by perrin (Chancellor) on May 18, 2004 at 22:33 UTC
    Well, that sounds pretty straightforward actually. You need to index your mail based on various criteria, and then create a way to view them that links to other things related through those criteria. The obvious approach would be to use a database, build an indexer program that you can feed your mail to, and build a web application for displaying and cross-linking. The first step would be to decide which things you will cross-link on and work on creating a database schema and figuring out how to extract them from an e-mail.
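    As a rough illustration of the schema step, here is a Python sketch using SQLite. The table names (`message`, `facet`), the column names, and the helper functions are assumptions invented for this example; any database with an index on the facet values would do the same job.

```python
import sqlite3

# Hypothetical schema: one row per message, one row per cross-linkable
# attribute ("facet") such as a person, URL, list or attachment name.
SCHEMA = """
CREATE TABLE IF NOT EXISTS message (
    id      INTEGER PRIMARY KEY,
    msg_id  TEXT UNIQUE,
    sent    TEXT,
    subject TEXT
);
CREATE TABLE IF NOT EXISTS facet (
    message_id INTEGER NOT NULL REFERENCES message(id) ON DELETE CASCADE,
    kind       TEXT NOT NULL,
    value      TEXT NOT NULL,
    UNIQUE (message_id, kind, value)
);
CREATE INDEX IF NOT EXISTS facet_lookup ON facet (kind, value);
"""


def open_index(path=":memory:"):
    """Open (or create) the index database."""
    db = sqlite3.connect(path)
    db.execute("PRAGMA foreign_keys = ON")
    db.executescript(SCHEMA)
    return db


def add_message(db, msg_id, sent, subject, facets):
    """Store one message plus its (kind, value) facet pairs; return its row id."""
    cur = db.execute(
        "INSERT INTO message (msg_id, sent, subject) VALUES (?, ?, ?)",
        (msg_id, sent, subject),
    )
    db.executemany(
        "INSERT OR IGNORE INTO facet VALUES (?, ?, ?)",
        [(cur.lastrowid, kind, value) for kind, value in facets],
    )
    db.commit()
    return cur.lastrowid


def related(db, message_id):
    """Other messages sharing at least one facet with the given message."""
    rows = db.execute(
        """
        SELECT DISTINCT m.id, m.subject, f2.kind, f2.value
        FROM facet f1
        JOIN facet f2 ON f2.kind = f1.kind AND f2.value = f1.value
        JOIN message m ON m.id = f2.message_id
        WHERE f1.message_id = ? AND f2.message_id != ?
        ORDER BY m.id, f2.kind, f2.value
        """,
        (message_id, message_id),
    )
    return rows.fetchall()
```

    The `related` query is the cross-linking part: given one message, it returns every other message that shares a person, link, or any other facet with it, along with the facet that connects them.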
      Yup, the concept is straightforward. And once visualized, it is really elegant and addictive.

      The problem is building something like this, which is why I asked originally if someone knew of any existing open-source, Perl-based effort on this.

      Re your suggestion, I don't believe a database is a good option. A database creates another, well, database. No, no. I would envision this app to be merely another view, albeit a value-added view, to the existing mail data store so that the view would always be up-to-date. Otherwise one would have to worry about synchronization. That is the problem with Zoe. Once it has imported everything, it has imported everything, even if it is spam or junk or just plain useless. After that even if one deletes those worthless messages from the mailboxes, they still remain in Zoe.

      OTOH, a database might be a useful add-on, but syncing would be very important.

      Anyway, thanks for the ideas.

        If you want to search things -- dates, senders, URLs, related text, whatever -- quickly, you need to build an index. Looking through 2GB of e-mail on the fly to see what other stuff this person sent will take way too long. You could create your own indexing system, but there's not much point when free databases are available.

        It's pretty easy to make your mail system run a script to index each new e-mail you get when it arrives, or to look through your mailbox once an hour and add anything that's new.
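        The hourly-scan variant might look something like this Python sketch. It assumes an mbox file and uses the `Message-ID` header to recognize mail it has already seen; `index_new`, `seen`, and the `index_one` callback are hypothetical names, and in practice the set of seen IDs would live in the index itself rather than in memory.

```python
import mailbox


def index_new(mbox_path, seen, index_one):
    """Index messages whose Message-ID is not yet in `seen`.

    `seen` is a set of already-indexed Message-IDs and `index_one` is
    whatever function does the real indexing. Returns the number of
    messages added on this pass.
    """
    added = 0
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            msg_id = msg.get("Message-ID")
            # Messages without a Message-ID are skipped in this sketch.
            if msg_id is None or msg_id in seen:
                continue
            index_one(msg)
            seen.add(msg_id)
            added += 1
    finally:
        box.close()
    return added
```

        Run from cron once an hour, a pass like this only pays the indexing cost for mail that arrived since the last run, although it still reads through the whole mailbox to find it.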

        You can't do this without some sort of database. At least, not for large mail stores. In order to display a view, you must have a model for it. Searching through every email for every view change would take too long for large mail stores; thus you must index.

        That doesn't necessarily mean using an SQL database, though the flexibility would be useful in this case, IMHO.

        To fix your problem, whenever mail is deleted, it needs to be removed from the database. Sounds like it would be good to combine this with a mail client. Otherwise, a periodic re-scan of the mail store is needed to delete old links.

        Perhaps store a hash of each message to identify it, then check on each re-scan whether that message still exists in the mail store?
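        A minimal Python sketch of that idea, assuming the index is just a dictionary keyed by a SHA-256 digest of the raw message bytes (`message_key` and `sync` are made-up names for illustration):

```python
import hashlib


def message_key(raw_bytes):
    """Identify a message by a digest of its raw bytes."""
    return hashlib.sha256(raw_bytes).hexdigest()


def sync(index, current_messages):
    """Bring `index` (a dict of key -> raw message) in line with the mail store.

    Entries whose message has vanished from the store are dropped, and
    messages not yet indexed are added. Returns (added, removed) counts.
    """
    current = {message_key(raw): raw for raw in current_messages}
    stale = set(index) - set(current)   # indexed, but deleted from the store
    fresh = set(current) - set(index)   # in the store, but not yet indexed
    for key in stale:
        del index[key]
    for key in fresh:
        index[key] = current[key]
    return len(fresh), len(stale)
```

        One caveat: if the mail client rewrites a message in place (some add a `Status:` header to mbox files, for instance), the digest changes and the message will look like a deletion plus a new arrival. Hashing only a few stable headers would be one way around that.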

        There is also the problem of determining what kind of data storage this will look at. Are you wanting to parse mailbox format, Maildir format, or IMAP?

        How about a POP3 proxy that indexes all messages as it downloads from the server?

        The perltwingular interface could have functions to do deletions and management too, to get rid of spam and junk. Heck, an interface to SpamAssassin could help with that.

        ... I don't believe a database is a good option. A database creates another, well, database. No, no. I would envision this app to be merely another view, albeit a value-added view, to the existing mail data store so that the view would always be up-to-date.
        Perhaps you need a paradigm shift. This system, once built and proven reliable, might be regarded as the "authoritative" archive of your email, rather than the old mbox (or whatever) files.

        Given a decent interface, most email clients could deal directly with the twingle store rather than the intermediate mailbox.

        Matt