Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have a script using DBI to make lots of select queries and print them out on to a html table.

Everytime the script is called, I use the prepare_cache method to execute each select query. Depending on the search results, the # of select queries made will depend.

This HTML table contains several categories, lets say category 1, category 2, category 3, all the way up to 6 categories.

To fill the table with data, each category has its own select query that will be return the data to be printed on.

So my script is a lot of prepare_cache's , and one while { statement that will contain all the $sth-execute($a) commands.

What I'm stuck on is whether to continue using this route and hope for the best if it the site ever grows large. Its potential may be 50,000 viwers viewing this script causing massive select queries being made a day. If I can avoid making this many queries, should I do so? Or a script making this many queries using a prepare_cache method in DBI is normal?

Thanks, Bobby
  • Comment on How many Select Querie is considered too much?

Replies are listed 'Best First'.
Re: How many Select Querie is considered too much?
by imp (Priest) on Aug 10, 2006 at 00:09 UTC
    It would be difficult to give a useful answer with the information provided. What number of queries is currently making you hesitate?

    The number of queries can cause overhead due to the round-trip delay, but this will depend on a number of factors.

    • Location of the database server. Same machine or different
    • Database load
    • Waiting for table/row access when the data is locked for writing.
    But using monolithic queries can also cause overhead.
    • Expensive join clauses
    • Acquiring table/row locks
    • Amount of data being transferred
    If you can provide more information about the way you use the database we can likely offer areas to improve upon.

    But honestly the best way to see which approach works best for you is to benchmark it. Optimizing before you identify the bottlenecks can absorb a huge amount of your development time with very little return on the investment.

    I'm very fond of using Benchmark to determine which algorithm runs fastest, but this is only part of the problem in your case. You really need to see how well it performs when the load is as high as you fear it might become.

Re: How many Select Querie is considered too much?
by Anonymous Monk on Aug 09, 2006 at 23:57 UTC
    Depending on the data you might be able to separate the client request from database queries. A separate script can query the database every few minutes or seconds and save the results to a file (or shared memory). Your CGI script reads that file instead of the database. Then no matter how many requests you get the number of database queries remains constant.

    Try changing this sequence of events:

    request -> database query -> response
    
    Into something like this:
    daemon  -> database query -> cached
    request -> cached value   -> response
    
Re: How many Select Querie is considered too much?
by duckyd (Hermit) on Aug 10, 2006 at 00:15 UTC
    How many is too many depends on specifics you don't provide us. How many queries can you database handle? Do the queries run fast enough to be acceptable from the user's point of view? Do you anticipate needing additional capacity in the near future?

    As far as using prepare_cache, if you are executing the same statment multiple times with the same $sth, it should help, but are you really re-using the same $sth between requests?

Re: How many Select Querie is considered too much?
by Anonymous Monk on Aug 10, 2006 at 00:35 UTC
    Thanks for the replies.
    ,br> imp, thanks for pointing that out. I never learned benchmarking, but i will take it to myself to do so. I guess now is a good time.

    Now that I think about it, most of the queries I am talking about for each category are SELECT COUNT(*) from db where id = ? and date = ? and status = ? type of queries....

    Meaning it only fetches one row each query. However, all select queries are performed on the same table.

    As for the first suggestion by Anon. Monk, the cache method seemed like a very good idea until I realized that this page is dynamic, but needs to act as static. There is a category called "status" where whenever a user receives a message, the status column will print out a "pending" link that will lead to a page that displays the pending messages. The messages aren't meant to be private and are sorta meant to be viewed by the public.

    Thanks, Bobby
Re: How many Select Querie is considered too much?
by Anonymous Monk on Aug 10, 2006 at 00:50 UTC
    To give more of an idea, I jotted down some scripting on how my code will look.
    my ($id, $username); my $sth = $dbh->prepare_cached(<<SQL); Select id, username From members SQL $sth->execute(); $sth->bind_columns(undef, \$id, \$username); #category queries my ($count); my $sthx = $dbh->prepare_cached(<<SQL); SELECT COUNT(*) From Results WHERE user_id = ? AND read = '2' SQL my ($count_v); my $sthv = $dbh->prepare_cached(<<SQL); SELECT COUNT(*) From Results WHERE user_id = ? AND read = '3' SQL my ($count_c); my $sthc = $dbh->prepare_cached(<<SQL); SELECT COUNT(*) From Results WHERE user_id = ? AND read = '4' SQL my ($count_e); my $sthe = $dbh->prepare_cached(<<SQL); SELECT COUNT(*) From Results WHERE user_id = ? AND pending = 'p' SQL my ($count_m); my $sthm = $dbh->prepare_cached(<<SQL); SELECT COUNT(*) From Results WHERE user_id = ? AND requests > 5 SQL my ($count_w); my $sthw = $dbh->prepare_cached(<<SQL); SELECT COUNT(*) From Results WHERE user_id = ? AND votes > 0 SQL # while $sth->fetch() { $sthx->execute($id); $sthx->bind_columns(undef, \$count); $sthx->fetch(); print "<td>$count</td>"; $sthv->execute($id); $sthv->bind_columns(undef, \$count_v); $sthv->fetch(); print "<td>$count_v</td>"; #etc...etc.. until all sth's are executed. #going to add a limit to the # of members to fetch, perhaps #100 membe +rs a page....if thats the case: # #1 select query to fetch all 100 member id's. #6 selects for each member, x 100 members so 600 queries to #print out + the table full of values. #total to 601 select queries } <br><br> Thanks, Bobby
      This set of queries:
      SELECT COUNT(*) From Results WHERE user_id = ? AND read = '2' SELECT COUNT(*) From Results WHERE user_id = ? AND read = '3' SELECT COUNT(*) From Results WHERE user_id = ? AND read = '4'
      Could be replaced with something like this:
      SELECT read, COUNT(*) From Results WHERE user_id = ? AND read in (2,3, +4) GROUP BY read
      Assuming there are entries for those three values you would get back 3 rows, where column 0 is the 'read' value, and column 1 is the number that had the value.

      UPDATE Fixed the query, as TrekNoid pointed out that I omitted the group by clause. Apparently I neglected to select that portion before hitting copy.

        Wouldn't you need a 'group by' in there?
        SELECT read, COUNT(*) FROM Results WHERE user_id = ? AND read in ('2','3','4') GROUP BY read
        I know Oracle really well, so it's possible that I'm just being Oracle-centric :)

        Trek

      Try to avoid nested queries by storing the results from the first query in a hash or array. Then loop over that structure to do more queries. You can also reuse those statement handles and variables. This example avoids the nesting and might be further improved with joins, or a more normalized database.
      my $sth = $dbh->prepare_cached(<<SQL); Select id, username From members SQL $sth->execute(); $sth->bind_columns(\my($id,$username)); my $user; while ($sth->fetch) { $user->{$id} = $username } for my $id (keys %$user) { $sth = $dbh->prepare_cached(q{SELECT COUNT(*) From Results WHERE use +r_id = ? AND read = ?}); $sth->execute($id,2); $sth->bind_columns(\my($count)); $sth->fetch; print $count; $sth = $dbh->prepare_cached(q{SELECT COUNT(*) From Results WHERE use +r_id = ? AND read = ?}); $sth->execute($id,3); $sth->bind_columns(\$count); $sth->fetch; print $count; }