in reply to Speed of MySQL DBs

The situation that you describe raises a few questions:

Regarding that last item, it would seem easy enough to take whatever method you are using at present to invent distinct table names, and instead have a single set of tables where that extra chunk of information is stored into the rows that are created on a given run. In other words, instead of this:

    run1_tbla: (fld1, fld2, fld3, ...)
    run1_tblb: (fldx, fldy, fldz, ...)
    run2_tbla: (fld1, fld2, fld3, ...)
    run2_tblb: ...

You could just have this:

    table_a (run_id, fld1, fld2, fld3, ...)
    table_b (run_id, fldx, fldy, fldz, ...)

Maybe "run_id" could be something like the date/time of the run, or whatever.
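
As a rough illustration of the single-table layout, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so it runs without a MySQL server; the table and column names follow the example above, while the helper names (load_run, rows_for_run) and the sample values are made up for illustration.

```python
import sqlite3

# sqlite3 stands in for MySQL here so the sketch runs without a server.
# Table/column names follow the post; the sample values are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (run_id TEXT NOT NULL, fld1 TEXT, fld2 TEXT, fld3 TEXT)"
)


def load_run(conn, run_id, rows):
    """Insert one run's rows, tagging each row with its run_id."""
    conn.executemany(
        "INSERT INTO table_a (run_id, fld1, fld2, fld3) VALUES (?, ?, ?, ?)",
        [(run_id, *row) for row in rows],
    )


def rows_for_run(conn, run_id):
    """Return only the rows that belong to the given run."""
    return conn.execute(
        "SELECT fld1, fld2, fld3 FROM table_a WHERE run_id = ? ORDER BY rowid",
        (run_id,),
    ).fetchall()


# Two runs share one table; run_id (here a date/time string) keeps them apart.
load_run(conn, "2006-02-05T21:00", [("a", "b", "c"), ("d", "e", "f")])
load_run(conn, "2006-02-06T05:41", [("g", "h", "i")])
```

The run identifier that used to be baked into the table name becomes an ordinary column value, so scripts that read the data select on run_id instead of building a table name.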

Re^2: Speed of MySQL DBs
by rsiedl (Friar) on Feb 06, 2006 at 05:41 UTC
    Hi and thanks for your response,
    I'll try to answer your questions:
    • The script uses IDs from a "master index" table to assign a name to each of the new tables it creates.
    • The script is loading in a lot of data, sometimes millions of records - hence my reasoning for splitting the tables into smaller ones (which also answers your 5th point).
    • The db is not write-only. It is queried by other scripts, which also use the IDs from the "master index" table to determine which set of tables they should be using.
    • I want to prevent combining the data from each search (another reason to keep the data in separate tables).
    • See points above :)
    My whole aim is to try to speed up the MySQL side of things. I have achieved this so far by reducing the amount of data in each table (i.e. instead of one table with 1,000,000 records - 900,000 of which are not needed for the specified query - I've split it up into 10 tables with 100,000 records each).
    That has led me to my initial question - will too many tables slow down the db and, if so, would it be quicker to use separate databases?
    I guess separate db's would also have the advantage of better security, to avoid mixing the search data in any way.
      Given the added detail, I agree that one big table is not as good as a bunch of smaller tables. As to whether a lot of tables is better or worse than a set of distinct databases ... well, there is a whole chapter (7) in the MySQL manual about optimization. Have you looked at that yet?

      I don't know offhand whether MySQL's data files are "one per table" or "one per database" -- and I'm not sure if that even makes a difference for performance.

      If you aren't familiar with the science and art of creating indexes on fields that are often used in the "where" clause, maybe it would be more effective to study that before trying "separate databases vs. separate tables in one database". Indexes are pretty easy to add to a running database, and can have a dramatic effect on query response time. (UPDATE: Then again, if you have other good reasons for using separate databases, go ahead -- I doubt it would damage performance at all, and might even help.)
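
      To make the point about indexing "where" clause fields concrete, here is a minimal sketch. It uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN (the output format differs between the two); the table name results, the column search_id, and the index name are all hypothetical.

```python
import sqlite3

# sqlite3 is a stand-in for MySQL; "results", "search_id" and the index
# name are hypothetical names chosen for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (search_id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO results (search_id, payload) VALUES (?, ?)",
    [(i % 10, f"row {i}") for i in range(1000)],
)

QUERY = "SELECT payload FROM results WHERE search_id = ?"


def query_plan(conn):
    """Return the planner's description of how QUERY would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (3,)).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


before_plan = query_plan(conn)  # no index yet: the whole table is scanned

# Add an index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_results_search_id ON results (search_id)")

after_plan = query_plan(conn)   # the planner can now search via the index

print("before:", before_plan)
print("after: ", after_plan)
```

      In MySQL the equivalent check would be EXPLAIN SELECT on the real query, before and after a CREATE INDEX on the filtered column.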

        You really think multiple tables are better? I'm really curious as to why and when this would be beneficial. I read several parts of the MySQL documentation on optimization, but I never really saw a conclusive statement of when this would be good to do. I would have thought that a well-indexed single table would always be better than multiple tables.


        ___________
        Eric Hodges
      I think that the multi-table approach is a bad one. MySQL's single unique key lookups are really fast, and you can create a single ID by concatenating the keys in sorted order with their values.
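
      One way to read "concatenating the keys in sorted order with their values" is sketched below. This is an assumption about the intended scheme, not a confirmed implementation: the "key=value" and "&" separators, the cache table, and the helper names are all invented for illustration, and sqlite3 again stands in for MySQL.

```python
import sqlite3


def make_cache_id(params):
    """Join key=value pairs in sorted key order.

    Sorting makes the ID independent of the order in which the
    parameters were supplied. Caveat: values containing '&' or '='
    could collide and would need escaping in real use.
    """
    return "&".join(f"{key}={params[key]}" for key in sorted(params))


# Hypothetical single table with one unique key (sqlite3 stand-in for MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (cache_id TEXT PRIMARY KEY, payload TEXT)")


def store(conn, params, payload):
    """Insert or overwrite the row for this parameter set."""
    conn.execute(
        "INSERT OR REPLACE INTO cache (cache_id, payload) VALUES (?, ?)",
        (make_cache_id(params), payload),
    )


def fetch(conn, params):
    """Look up one row by its unique key; returns None when absent."""
    row = conn.execute(
        "SELECT payload FROM cache WHERE cache_id = ?",
        (make_cache_id(params),),
    ).fetchone()
    return row[0] if row else None


store(conn, {"search": 7, "db": "genes"}, "cached result")
```

      Every lookup then goes through a single unique key, regardless of how many searches share the table.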

      While 10 tables (and thus 30 files in MySQL) is not a lot, it is better to avoid multiple tables.

      You also have the problem of cache expiry, because you should go through and clear out the expired data. Multiple tables add a level of complication here: you first have to figure out which tables you have, then expire the data in all of them.

      MySQL works very well with large tables - just do an EXPLAIN SELECT with a single unique key and you will see how little work it has to do to return your required results.