TedPride has asked for the wisdom of the Perl Monks concerning the following question:

Let's assume I have a large number of member records that have to be accessed constantly. Let's also assume each member record has to include a bio, which can be anywhere from 0 to quite a large number of characters, will be accessed much less often than the other fields of the record, and will be updated every few days to a week. Am I better off keeping the bio in the main table, where it will likely cause all sorts of gaps and reduce efficiency for the majority of views, or put the bio in a separate table, where I'll have to access two tables instead of one the minority of views where someone wants to view the bio?

Or in other words, how much will having a TEXT field in a table reduce efficiency, assuming I OPTIMIZE TABLE reasonably often? Does the reduction in efficiency only have to do with gaps caused by TEXT contents getting larger and having to move, or is there a reduction in efficiency inherent to TEXT fields with large contents? Are accesses significantly faster if the table is more compact?

Also, to a lesser extent, what's the tradeoff for having indexes? What's the ratio of views to updates that makes an index break even in terms of efficiency? Assume 10,000 records or so.

Unconsidered delete by holli - enough keep votes (5/1/3)

  • Comment on OT: MySQL - TEXT field efficiency? Do I include in table or separate?

Replies are listed 'Best First'.
Re: OT: MySQL - TEXT field efficiency? Do I include in table or separate?
by jhourcle (Prior) on May 29, 2005 at 23:41 UTC

    You haven't mentioned your usage of the database. You've mentioned what sort of information you'd be placing in the database, but not how you'd be retrieving the data.

    If you will typically not be accessing the bio, then it would make sense to store the TEXT field seperately. If you have the extra disk space, you can always use a materialized view to join the two tables, or to create a sub table without the TEXT field. Whatever you do, make sure that the TEXT field is the last field in the table. (you also typically want to place VARCHARs, or any other variable length fields to be at the end)

    As for what's the percentage of reads to writes that makes an index break even, that's another item with no simple answer -- if the reads involve 25% of the records, then the index is most likely never beneficial. You might also have more than one index on a table, so there would be a different point of breaking even on a table with 8 indexes vs. a table with only one index.

    It's also possible that the type of tables might be a factor. I have more experience with Oracle than mySQL, but Oracle uses indexes for record level locking in transactions ... I have no idea if innoDB tables do something similar.

    So, I'll go back to my past comments about tuning -- no one's situation is the same as someone else's. Different amounts of memory or disk contension may change how things behave. Your best bet is to build a prototype, and try to load it how you think it will be in production. If it doesn't meet your requirements, then spend time tuning it. Every so often, you'll want to see if your load has change, and retune.

Re: OT: MySQL - TEXT field efficiency? Do I include in table or separate?
by trammell (Priest) on May 30, 2005 at 01:34 UTC
    From http://dev.mysql.com/doc/mysql/en/data-size.html:
    For MyISAM tables, if you don't have any variable-length columns (VARCHAR, TEXT, or BLOB columns), a fixed-size record format is used. This is faster but unfortunately may waste some space. See Section 14.1.3, “MyISAM Table Storage Formats”. You can hint that you want to have fixed length rows even if you have VARCHAR columns with the CREATE option ROW_FORMAT=fixed.
    HTH, HAND.
Re: OT: MySQL - TEXT field efficiency? Do I include in table or separate?
by Cap'n Steve (Friar) on May 30, 2005 at 05:42 UTC
    I'm assuming that mySQL does a fairly good job of optimizing variable length fields. However, it can't hurt to keep the text fields in another table. What you're looking for is a join, something like this (assuming table1 is the main users table, table2 is the bio table, and textid is a field in both tables that stores the id of the bio):
    SELECT table1.*, table2.* FROM table1 LEFT JOIN table2 ON table1.textid = table2.textid WHERE table1.userid = 1
Re: OT: MySQL - TEXT field efficiency? Do I include in table or separate?
by TedPride (Priest) on May 29, 2005 at 23:54 UTC
    Can you explain what a materialized view is? Is there a way to collect field data from two tables using a single MySQL call?

      A materialized view is essentially a table that's built dynamically. In Oracle, there's a command to do it directly, but I do something similar in mySQL and just rebuild the table as needed. (typically through a cron job that fires each morning, as I deal a lot in data warehouse type systems, which mysql works rather well for) In mySQL you need to specify the view directly, whereas Oracle will do query rewriting if it realizes it can.

      As for getting fields from two tables, yes, you just have to tell mysql how to join them. Here's an example of creating a view, (plain old view, which is basically just a way to alias the table joins and field lists)

      CREATE VIEW myview AS SELECT a.field1, a.field2, a.field3, b.* FROM tablea a LEFT JOIN tableb b USING (id);

      I'd suggest seeing the mysql documentation for:

      There is a complex set of rules about trying to update directly through a view though... depending on your type of join, the database may not be able to figure out which record it actually needs to place the value, and you can end up with odd results. For what was described in the original post, however, it should be possible to update through a view to join the bibliography data, and everything else. (because it's a one-to-one relationship, and not a one-to-many)