in reply to Re: PostgreSQL cursors with Perl.
in thread PostgreSQL cursors with Perl.

About 4 seconds for SELECT COUNT(*) over a single day of data (about 100,000 rows). That's the worst case; query caching seems to be working, so repeated (or even slightly changed) queries of the same type take much less time.

Replies are listed 'Best First'.
Re^3: PostgreSQL cursors with Perl.
by jhourcle (Prior) on Apr 17, 2008 at 20:36 UTC

    Although the specifics are probably not worth it now, as you've already moved to a different database, you should _always_ look into tuning your database if you have queries that seem like they're running a bit long.

    I don't know enough about your database, so I don't know exactly what might've been going wrong, but I'd probably look at the following:

    • Check which indexes are / aren't being used ('EXPLAIN' in MySQL and PostgreSQL, 'EXPLAIN PLAN' in Oracle). For a limited class of queries a full table scan can be more efficient, but not when you're doing a COUNT(*).
    • Have the database analyze the tables so it can decide when it should / shouldn't use indexes ('ANALYZE TABLE' in MySQL; 'ANALYZE' in PostgreSQL; 'ANALYZE TABLE' in Oracle, plus 'dbms_stats.gather_table_stats' if your keys aren't evenly distributed).
    • Check to make sure that you have enough memory allocated to the database to keep the necessary indexes pinned in memory and reduce disk IO.
    • Reduce / eliminate the complexity of table joins if possible. (or force a specific type of table join as appropriate for the situation)
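The first point above can be illustrated end-to-end with SQLite (whose planner statement is 'EXPLAIN QUERY PLAN' rather than the 'EXPLAIN' / 'EXPLAIN PLAN' forms of MySQL, PostgreSQL, and Oracle). A minimal sketch; the table and index names are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER, payload TEXT)")
conn.execute("CREATE INDEX idx_events_ts ON events (ts)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(i, "x") for i in range(1000)])

# Ask the planner how it would run a ranged COUNT(*). A healthy plan
# mentions the index (a SEARCH); a plain "SCAN events" would mean a
# full table scan, which is the symptom worth tuning away.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT COUNT(*) FROM events WHERE ts BETWEEN 100 AND 200"
).fetchall()
detail = " ".join(row[-1] for row in plan)
print(detail)
```

The same habit carries over directly: prefix the slow query with your database's EXPLAIN variant and check whether the index you expect actually shows up in the plan.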
Re^3: PostgreSQL cursors with Perl.
by andreas1234567 (Vicar) on Apr 18, 2008 at 06:00 UTC
    I often hear people say that MySQL is "slow". Which often translates to sub-optimized or under-powered. Consider the following real-world example, which shows that MySQL works fine with enormous data sets (yes, that's almost 1 billion rows in a single table with 41 columns, 1 primary key, 1 unique key, and 9 non-unique keys):
        mysql> select count(*) from t_prod;
        +-----------+
        | count(*)  |
        +-----------+
        | 952174654 |
        +-----------+
        1 row in set (0.00 sec)

        # ls -lh /var/lib/mysql/prod | grep t_prod
        -rw-rw---- 1 mysql mysql  11K Jun  4  2007 t_prod.frm
        -rw-rw---- 1 mysql mysql 222G Apr 18 07:35 t_prod.MYD
        -rw-rw---- 1 mysql mysql 102G Apr 18 07:35 t_prod.MYI
    And I can tell you, operations on that table are very fast as long as the indices are used properly.

    Maintenance, however, is a nightmare. For example:

    • When there's a power outage or voltage fluctuation and that 222G table's indices become corrupt and need to be rebuilt. While the rebuild runs, the table is locked and inaccessible.
    • When one needs to reduce the data set and the delete operation takes hours, during which the table is locked and inaccessible. (Of course, this can be mitigated by using partitioned tables.)
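Besides partitioning, a common way to soften the long-lock problem is to delete in small batches, committing between batches so other sessions are only blocked briefly. A sketch using SQLite (which needs a rowid subquery; MySQL supports DELETE ... LIMIT directly) — the table layout and the ts < 5000 cutoff are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_prod (ts INTEGER, payload TEXT)")
conn.executemany("INSERT INTO t_prod VALUES (?, ?)",
                 [(i, "x") for i in range(10_000)])
conn.commit()

BATCH = 1000
deleted = 0
while True:
    # Delete one small batch per transaction instead of one huge
    # DELETE: each commit releases locks, so readers and writers
    # interleave with the cleanup rather than waiting hours for it.
    cur = conn.execute(
        "DELETE FROM t_prod WHERE rowid IN "
        "(SELECT rowid FROM t_prod WHERE ts < 5000 LIMIT ?)",
        (BATCH,))
    conn.commit()
    if cur.rowcount == 0:
        break
    deleted += cur.rowcount

print(deleted)  # 5000 rows removed, in lock-friendly batches
```

The trade-off is total wall-clock time: many small transactions take longer overall than one big one, but the table stays usable throughout.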
    --
    When you earnestly believe you can compensate for a lack of skill by doubling your efforts, there's no end to what you can't do. [1]