in reply to Help, RDBMS have taken over my brain!

One technique that I see frequently splits the message body from the header info. The message body is stored as a flat file in a directory, but the header and threading information is stored in an RDBMS.

Where this really comes into its own is when you store the headers in an RDBMS on a SAN, and keep the file refs for the bodies on a NAS system. It scales wonderfully huge!

If the data that would be stored in the database has few relationships, then you wouldn't necessarily have to use a fullblown RDBMS; you could simply use a collection of flat files, or a simpler DB, like .dbm files. That way, you could exploit things like tie; Tie::File for files, and even DBI for more complex structures.

Thoughts,
-v
"Perl. There is no substitute."
  • Comment on Re: Help, RDBMS have taken over my brain!

Replies are listed 'Best First'.
Re^2: Help, RDBMS have taken over my brain!
by Beechbone (Friar) on Nov 24, 2004 at 13:24 UTC
    I always wanted to code a forum that stores the bodies in a table like

    PartID, MessageID, ThreadID, BodyPart VARCHAR(254)

    so one could select a message with:

    SELECT BodyPart FROM Body WHERE MessageId=? ORDER BY PartID

    and all messages of a thread with:

    SELECT MessageID, BodyPart FROM Body WHERE ThreadId=? ORDER BY MessageID, PartID

    and get a preview of a message with:

    SELECT BodyPart FROM Body WHERE MessageId=? where PartID=0

    Fulltext search with preview:

    SELECT MessageID, ThreadID, BodyPart FROM Body WHERE BodyPart LIKE '%foo%' ORDER BY MessageID

    or, or, or... :-)


    Search, Ask, Know
      That's all good (well, modulo a minor normalization quibble), except the full text search:
      select ... from ... where BodyPart like '%foo%' ...
      On a small dataset this will be OK, but it won't scale, as it requires a table scan (or at the very least a full index scan if BodyPart is indexed) for databases that use B-Tree (and derived) indexes due to the leading wild-card (%).

      Michael

        better (mysql at least) putting a fulltext index on BodyPart and use:
        MATCH(BodyPart) AGAINST('foo')