Groll has asked for the wisdom of the Perl Monks concerning the following question:

I have a problem with reading a table in an SQL Server database. A script that used to work quite happily, suddenly stopped returning records. The table in question has over 1,006,000 rows, but even when I get Perl to read the whole table, it returns significantly less than a million records. I can see nothing odd about the records in question, nor found any answers to this problem by googling or scouring the DBI docs.
The following is a simple script (names have been changed for confidentiality) that reads the records from just before the problem point to somewhere after. The last record ID read is 968077.
use strict; use DBI; $|=1; my (@row, $e); my$dbh = DBI->connect('DBI:ODBC:dsnname', undef, undef) or die "Couldn't connect to database: " . $DBI::errstr +; $dbh->{'RaiseError'} = 1; my $sql = qq{SELECT * FROM Transaction where TransactionID=?}; my $sth = $dbh->prepare($sql); foreach my $i (968000..1029829) { $sth->execute($i); if (@row = $sth->fetchrow_array()) { print "$row[0], $row[1], $row[2]\n"; } else { print "***** No record for $i *****\n"; $e = $sth->errstr; print " Error: $e\n"; } $sth->finish(); } $dbh->disconnect(); print "\n$0 finished\n\n";

That's so simple it's unlikely to help. Anyone got any clues as to the problem? Any pointers as to what to try next?
All help gratefully appreciated!
Thanks,
Groll.

Replies are listed 'Best First'.
Re: DBD::ODBC won't read past a certain record
by Corion (Patriarch) on Aug 05, 2003 at 07:08 UTC

    There are two (related) things that I see that could be happening, not knowing your table description :

    How do you know how many rows are there in your database? Did you run SELECT COUNT(*) FROM Transaction ? This would be the only sure way to know how many rows there are in your table.

    How do you know that there are no duplicate IDs? Your script finishes the loop after it has retrieved the first row for a given ID without throwing an error if there are more rows to be retrieved. You should check your column restrictions, add the check to your program, and run some SQL to check that each ID is unique (which is too early in the morning for me to come up with, but someone will surely correct me):

    SELECT DISTINCT a.ID,b.ID FROM Transactions as a,Transactions as b whe +re a.ID = b.ID

    Caveat: Please do not try this command in the production directly, because we at Cartesian Products take our business seriously (or so I fear). I think this command will chew up some serious memory.

    Also, did you check what your maximum ID is?

    SELECT max(ID) FROM Transactions

    Nothing of this should apply if your column definition for ID defines PRIMARY KEY, unless you used some bulk loader command or other external means to initialize your database.

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
      Thanks for your input, but I wrote this in a bit of a hurry, and perhaps didn't write all the information I had. The TransactionID is a primary key, and I can see all the record via both Access and the SQL Server Enterprise Manager. As i said, I know there are well in excess of 1 million records.
      The stinger is that I wrote the equivalent code using Win32::ODBC and got the same results.
Finding missing sequence records in DB table
by gmax (Abbot) on Aug 05, 2003 at 11:41 UTC

    If I understand correctly, the gist of what you are asking is "how can I find which records are missing from a given sequence?"

    If this is true, one method to find out is to use an auxiliary table containing all the sequence numbers, and perform a LEFT JOIN to your table.

    First, create such table and fill it with the values from 1 to the maximum record number you want to consider. What Corion said about maximum record number and unique numbers in Transaction are still valid and you should check them. However, assuming that TransactionID is unique, this method will find the missing rows very fast.

    #!/usr/bin/perl -w use strict; use DBI; my $dbh = DBI->connect("dbi:driver:database", "user","password",{RaiseError=>1}) or die; $dbh->do(qq{ create table sequence (id int not null primary key)) }); my $sth = $dbh->prepare(qq{ insert into sequence values (?) }); $dbh->{autocommit} =0; $dbh->begin; for ( 1.. 1_200_000) { $sth->execute($_); if (($_ % 1000) == 0) { $dbh->commit; $dbh->begin; } } $dbh->commit; my $min = 968000; my $max = 1029829; my $sequence_query = qq{ SELECT a.id FROM sequence a LEFT JOIN Transaction b ON (a.id=b.TranstactionID) WHERE a.id BETWEEN $min AND $max AND b.TranstactionID IS NULL }; $sth = $dbh->prepare($sequence_query); $sth->execute(); while (my ($row) = $sth->fetchrow_array) { print "$row is missing \n"; } $dbh->disconnect();

    HTH

     _  _ _  _  
    (_|| | |(_|><
     _|   
    
      Sorry, but this is not the problem. As I said, there are over 1,006,000 records (all of which I can see via Access, which uses the same ODBC DSN, and Enterprise Manager remotely).
      Many thanks for your input though. My other replies have further details following on from other responses.
Re: DBD::ODBC won't read past a certain record
by mpeppler (Vicar) on Aug 05, 2003 at 09:10 UTC
    When in doubt - check the data.

    I very much doubt that DBI or DBD::ODBC are playing tricks. Instead the most likely issue is that the TransactionID sequence isn't continuous, or has duplicates. On a Unix box I'd do a quick fetch of all the transactionID values into an OS file and then run wc on it to get the number of lines - there are probably similar tools on Windows.

    You can also check for duplicates with

    select TransactionID, count(*) from Transaction group by TransactionID having count(*) > 1
    This will list all the TransactionID values that exist more than once.

    Michael

Re: DBD::ODBC won't read past a certain record
by zby (Vicar) on Aug 05, 2003 at 07:39 UTC
    Not very probable, but the TransactionID's could not compose a continues sequence. I believe you should try a modified SQL statement:
    my $sql = qq{SELECT * FROM Transaction where TransactionID > 968000};
Re: DBD::ODBC won't read past a certain record
by LanceDeeply (Chaplain) on Aug 05, 2003 at 15:31 UTC
    a couple of things to try:
    • start with creating a logfile: DBI->trace(9, 'C:\dbitrace.txt')

    • select only the one column TransactionID
      SELECT TransactionID FROM Transaction where TransactionID=?
      if you get all the rows back, then DBI is having trouble with one of the other columns in your query.

    i suspect you may have a large value in a text column, larger than the default size specified by DBI. the logfile you created should have some interesting info about the problem.

      I agree with LanceDeeply.

      A too-large text column was the first thing I thought of when I saw this node, since I've had the same experience. Couple that with the fact that it used to work but no longer does, and that points to the data as well...

      Here's the code I use to take care of this:

      # # Increase the max size of the text fields retrieved, also prevent us # from quitting when we over-step our bounds. # $dbh->{LongReadLen} = 1000; $dbh->{LongTruncOk} = 1;

      Good luck.

      Pat

        I've come across this too, as had the original author of the script in question. He had set the LongReadLen to 80,000, which is a bit excessive! I've just retried it with both the LongReadLen set to 16,000 and the LongTruncOk set true, for just the 968077 & 968078 records, only selecting the transactionID, which I know exist via Access, with the DBI trace on, as suggested by one of the other replies, still with no joy.
        The trace shows the execute statement for the second record returning with '0E0' (zero but true) instead of -1. No errors are reported.
        We need to get this working really soon for accounting purposes, so I'm currently looking at a rewrite in Java ... Sacrilege!
Re: DBD::ODBC won't read past a certain record
by Anonymous Monk on Aug 05, 2003 at 15:09 UTC
    Maybe its not the code?

    I ran into a problem with mySQL (if that is what you are using) where the max_allowed_packet variable on the server was set to something around 1,048,576 bytes`.

    Although your code is not reflecting this... but it appears you put some debug code together to get the rows after 968000. If the grab before was a whole grab then this would maybe apply, otherwise it will not.

    If the total number of characters in the query with related_column is more than that(the default value of max_allowed_packet, you should split it into smaller parts and try again?

    I did this and set it to 16MB.
    --set-variable=max_allowed_packet=16M
      Thanks, but I don't think this is the problem. As I said above, this is with SQL Server, and I'm only reading one record at a time (I've tried reading just the TransactionID field, as suggested by one of the other responses, with the same result).
Re: DBD::ODBC won't read past a certain record
by JamesNC (Chaplain) on Aug 06, 2003 at 14:20 UTC
    If it is an access DB (sounds like it) have you tried to do a Compact and Repair? I had a similar problem and it fixed it. I also, found that I got confused because I forgot that some records had been deleted and I had Autonumber turned on and wondered why the id didn't match the total number of records returned and it was obvious.. Access remembers the last highest number to auto-increment with... anyway... try the Compact and repair.. couldn't hurt :)
      Sorry, but as stated in the original message, this is SQL Server.
        Sorry, I had read so many of the posters I was focusing on what you said about verifying that the record was there via access which threw me off (it would be helpful to know which version of SQL Server you are running also 7.0?)... If running the same query using ISQL on your host server has the same failure, then you can verify that it is not the DBI or ODBC interface. Other than that, I would also try to see if M$ has updated ODBC drivers, or if you have the latest ones installed on the Server. Before you run off to Java, I would run a simple query using VB and and see if you get the same results...if you do.. then you can really point the finger at the DB. I built a DB with 1.2Million records... (sheese.. it took 30 mins - 1GB in size) and ran the same query... and it returned the records just fine. (Access however - I don't have MS SQL Server at home.) Only stuff I could think of. Good luck. JamesNC