As part of an update to a website that I run, I'm trying to improve the efficiency of the database SQL statements that I'm using. The db is composed of several tables; the main one that will be searched against most often contains about 1500 items with a primary key. (Data will be added or removed from this table but only when I do said updates). Users of the site can rate such items in a number of catagories, but as to keep track of what users have rated already and any short comments they had made, I've stuck the ratings into a second table, with the item's id, the user's id, a unique id for the rating which is used to pull comments from a file, and other details. As of this time, each item has about 10 or so ratings, making around 10,000 items in the second table; this can increase dramatically with additional users rating the items on the site.
What I would like to do is to be able to search or sort on the averages or counts of these ratings when I join that table with the primary table, using the item id as the joining element. As I noted in my recent post DBI and JOINs, I found initially that JOIN operations were way too slow on my system, but others suggested indexing the tables may help. With other aspects of this same problem, I did do that and found that complex joins (ones that includes two joins) were occuring nice and fast after indexing. However, I cannot seem to improve the speed on joining the main table with the ratings table. Processing only 50 items on the main table took about a minute of CPU time, compared to less than a second for the other join I mentioned above. I can't seem to specify an index for this ratings table that will improve the method of joining.
Presently, I work around this by having fields in the primary table that stores the average and counts for the ratings, which is updated any time a new rating is entered. It works, but it seems mighty inefficient. And to relate this back to perl, the main table updating could be one of those things that I easy forget to do in one of the CGI scripts and therefore I'd like to have a more automatic method.
Can anyone suggest a possible way to improve the database join methods or indexing to get around this, or another way to store such data? Do note that I need each rating as a unique item as to keep track of whom has already rated as well as for commenting, so aggregating the results from the start won't work. I'm using Mysql as the database server in this case.
In reply to (OT) Improving a SQL JOIN statement by Masem
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |