You are still missing too much information to make a decision

  1. What is the maximum posts per minute you expect?
  2. What is the maximum reads per minute?
  3. How many overall posts per day? (estimated growth rate)
  4. How large do you expect the messages to be?
  5. Is it a write once system, or will there be re-editing of messages?
  6. What is your hardware budget for the project, or is there fixed hardware?
  7. What is the required uptime?
  8. Are you going to have an internal search engine?
  9. If so, what sort of information are you going to search on? (metadata, or the message itself?)
  10. What are your disaster recovery requirements?
  11. Do you need to support transactional concurency?
  12. What are your time constraints?
  13. Do you already have a database to use for this purpose?
  14. Do you already have experience with databases?

Moving lots of files around is not a problem. Tar and rsync are your friends. The only problem with files comes when you're trying to work with more files at the same time than your OS supports. Databases for file storage are basically just ways of getting around those problems, and keeping extra metadata catalogs on hand to find the required information in a more efficient manner.

Depending on just what the requirements are, I might go with the file system for the message bodies, and a database for the metadata (posting time, who posted, thread tracking, etc). I might also go with a heirarchical database, rather than a relational database, if that fit well with the anticipated characteristics. I might also look at repurposing an NNTP server, rather than starting from scratch.

Personally, I wouldn't optimize for storage space, unless you're not expecting anyone to read the posts. I'd optimize for reads/writes. Depending on the nature of the forum, I might have an aging system, that moves the entries from a read or write optimized system to an alternate storage mechanism for long-term storage.


In reply to Re: Large chunks of text - database or filesystem? by jhourcle
in thread Large chunks of text - database or filesystem? by TedPride

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.