Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

The code I have here works just fine bringing back the results I am looking for, but the problem is the amount of time that it is taking to process the query. If I put a exit just before the while loop it goes very fast how it should. The problem is when the code is been processing inside the while. Can anyone see some mistakes or some way to better improve this code, or the problem has nothing to do with the code but with the database itself. Another interesting thing is, sometimes the code will go fast maybe because of DB caching the data, but if I live it alone for about 20 minutes it will take forever again. I just would like to see if anyone could spot a problem on the code before I go after the database people. Is there anything perl wise that could be done to improve the speed of the fetching.

Thanks for the help!


#! c:\perl\bin\perl.exe -W use DBI; use strict; use CGI qw(:standard); use CGI::Carp; my $to_all_account; my @array_account; print header(); my $sql = " SELECT MASTERTABLE.ACCOUNT# FROM TEST.MASTERTABLE WHERE (MASTERTABLE.NUMBER LIKE '%1324940%') AND (MASTERTABLE.SERIAL = +'FF' OR MASTERTABLE.SERIAL = 'CC' OR MASTERTABLE.SERIAL = 'OO' OR MAS +TERTABLE.SERIAL = 'NN') "; #Connetcting to an AS400 Database using ODBC, is it secure? my $user = "MYUSER"; my $pass = "MYPASS"; my $dbh = DBI->connect("DBI:ODBC:MYDB",$user, $pass) || print "Connect + to MYDB fail: $!"; my $sth = $dbh->prepare($sql); $sth->execute() || die $sth->errstr; while ( $to_all_account = $sth->fetchrow_arrayref) { print "<br><font color=red><b>^^^^^^^^@{$to_all_account}^^^^^</b>< +/font><br>\n"; push (@array_account,@{$to_all_account}); } print "<br><font color=green><b>**@array_claim**</b></font><br>\n" +; }

Replies are listed 'Best First'.
Re: DB Question, speed is the answer!
by jZed (Prior) on Jan 14, 2005 at 15:11 UTC
    The only thing that stands out as slow in your code is the use of the LIKE predicate - you are performing a full-text search with wildcards and, depending on the number of records and the size of the fields, that could take a while. Yes, you should be using placeholders, but they won't speed up this particular snippet any. I am assuming that your database table is properly indexed, if not, "talk to the database people" :-). It's also possible that it's the database connection itself that is slowing you down, in which case you may want to explore persistence via mod_perl or something. If you have thousands of rows in the resultset, you might gain some speed by using DBI's bind_columns.
      Actually the results are not big at all, it should return around 10 rows with account numbers, that's all.
        I have to agree with jZed here, the predicate:
        MASTERTABLE.NUMBER LIKE '%1324940%'
        is going to perform a full table scan, no matter how many rows your return set has. By wildcarding on the left and right, you're defeating any index that the table would have, so it has no choice but to scan the whole table.

        If it's running faster the second time you run it, it's because the data is likely still cached from your first run.

        You might try re-working this into either two queries, or at least a sub-query, where you do:

        ( MASTERTABLE.SERIAL = 'FF' OR MASTERTABLE.SERIAL = 'CC' OR MASTERTABLE.SERIAL = 'OO' OR MASTERTABLE.SERIAL = 'NN')
        first, which will reduce your initial set, and *then* do the LIKE against that return set.

        That should help some

        TrekNoid

Re: DB Question, speed is the answer!
by dragonchild (Archbishop) on Jan 14, 2005 at 15:02 UTC
    How many rows are you handling? That's going to be the determiner of how fast your while-loop is going ...

    Oh - use placeholders. Good gods, man!

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

      Well, I could use place holders, don't know exactly how many rows, but it should be a lot of rows, could be thousands.
        Ok ... So, you have, say, 5000 rows returned. First off, you really think someone wants to read 5000 rows on a webpage? Secondly, you're going to have to do whatever it is you do within the loop 5000 times. And, since you're constructing strings within that loop, those strings are going to be taking more and more memory. The less memory you have free, the more you have to swap. Once you start swapping memory to disk, you are in a world of hurt.

        I'd try limiting the number of rows you return to no more than 100 and seeing if it helps.

        Being right, does not endow the right to be rude; politeness costs nothing.
        Being unknowing, is not the same as being stupid.
        Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
        Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

        Have you tried the same query in, say, MS Access (or some other client that can use ODBC) using the same ODBC connection? I think you will find that it will take an equally long time to return all the data.

        /J\

      Considering the SQL in the code fragment doesn't use anything variable, what use will placeholders have?

      print "Hello, world\n";
      Oh - use variables. Good gods, man!
        The use of placeholders should not be evaluated on a case-by-case basis, but instead on an overall basis. Understanding where placeholders can safely be avoided requires the same level of knowledge as knowing where 'strict' can be safely avoided. It is telling that nearly every Perl developer who has been lead on at least one project starts every file with use strict;.

        The same goes for placeholders. It is a good habit. The query isn't using anything variable ... today. If you use placeholders, you have future-proofed your query. By not using placeholders, you have introduced a place where bugs can occur and you have to be more vigilant. I don't know about you, but if I can avoid having to be vigilant because a certain class of bugs simply cannot occur, I'm going to.

        As for your example, if the snippet was part of a larger application, I would definitely say in a code review to use variables. There is no reason not to future-proof it.

        Being right, does not endow the right to be rude; politeness costs nothing.
        Being unknowing, is not the same as being stupid.
        Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
        Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Re: DB Question, speed is the answer!
by runrig (Abbot) on Jan 14, 2005 at 18:05 UTC
    How many rows in the table, and how much does the serial column filter? If the filter on the 'serial' column filters a significant percentage, you might get good results from an index on that column, and maybe changing the filter to mastertable.serial in ('FF', 'CC', 'OO', 'NN' ). Then again, maybe not, because some databases would just convert the 'in' clause to joined 'or' clauses like you have anyway. Getting a query plan would help (do you know how to get one for your database?).

    Also (unrelated), use RaiseError. You are checking the connect and execute, but not the fetch.

      No, I don't know how to get a query plan, can you explain how, thank you!
        I don't know how to get a query plan, can you explain how, thank you!
        It's a database specific question, and I don't know anything about AS400 databases, so, no. You'd have to talk to your database guys or read the database documentation. Though you could answer the other questions and figure out (guess) for yourself whether or not an index might be worthwhile.
Re: DB Question, speed is the answer!
by Anonymous Monk on Jan 14, 2005 at 15:19 UTC
    It's hard to say. It could be that most of the time is spend in the database - if so, you need to reconfigure/redesign your database/table/indices or query.

    But there is one thing in your while loop that could suck up lots of resources, and that's the pushing onto @array_account. If there's a lot to be fetched from the database, @array_account will get huge - and because it grows by bits and pieces, a lot of allocation of memory will happen. That will take time, specially if you're using significant amounts of it. Do you actually need all the data into one array?

      Yes I do, like I mentioned before the results shouldn't be that big, it's an array with a number of possible 10 account numbers in it, that gets into another sql call later in the code.
Re: DB Question, speed is the answer!
by Anonymous Monk on Jan 14, 2005 at 15:37 UTC
    If I look at your query, I suspect the database to be forced to do a tablescan (most likely because of the LIKE). This means that the time spend in the while depends on the number of rows in the table - not just the number of rows returned.

    The fact that there's a huge difference if you run it twice in a row only strenghtens my suspicion.