JayBee has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to read a CSV file and, for each row, update and insert records in the database, but the process stops after ~450-490 changes are made. I have 5,000+ records, and I've even removed the top 400 to see whether a bad character or field was to blame, but I still get the same failure. I'm not sure if the database has a time limit for connections or a limit on processes (which I doubt). I've used print $dbh->{mysql_error}, br, $dbh->{mysql_errno} in the Terminate sub that is called from the Execute and SQL subroutines, but I get 0, which supposedly means all is good. Still, the program ends early and says I've been disconnected. Here's the main subroutine:
use Text::CSV::Simple;
my $parser = Text::CSV::Simple->new;
$parser->want_fields(0, 1, 2, 4);
my @CSV_File = $parser->read_file($_[0]);
&SQL('connect');
for my $row(@CSV_File) {
    my @CSV_Row = @$row;
    my ($FN, $LN, $SSN, $Title) = @CSV_Row;
    my ($do1, $do2);
    $Check="SELECT id, lastname, firstname, ssn FROM member WHERE UPPER(lastname) = '$LN' AND UPPER(firstname) = '$FN')"; # Find matching member
    &Execute($Check, 1);
    while (my ($id, $ln, $fn, $sn)=$sth->fetchrow_array()) {
        $do1="INSERT DELAYED INTO reports (lastname, firstname, title, ID) VALUES ('$ln', '$fn', '$Title', '$id')";
        $do2="UPDATE LOW_PRIORITY member SET working = 'Y' WHERE (ID = '$id') LIMIT 1";
        if ($SSN ne $sn) {
            &LogEvent("Fail SSN: $SSN for $id"); # Compare SSN for exact match
        } else {
            &Execute($do1);
            &Execute($do2);
        }
        &LogEvent("Success");
    } # END WHILE

sub SQL {
    $dbh = DBI->connect("DBI:mysql:DBname","$un","$pw") || Terminate('Sorry, Disconnecting'); # Prints error . $_[0]
} # END SQL

sub Execute {
    if ($_[1] == 1) {
        $sth = $dbh->prepare( $_[0] ) || &Terminate('Sorry, not Preparing');
        $sth->execute || &Terminate('Not Executing');
    } else {
        $dbh->do( $_[0] ) || &Terminate('No Do');
        warn( $DBI::errstr ) if ( $DBI::err );
        $rc=$dbh->disconnect();
} # END Execute
Thanks in advance for your help.

Re: mySQL Times Out / Disconnects
by graff (Chancellor) on Sep 30, 2007 at 03:02 UTC
    Maybe you just made a copy/paste error when you posted your code, but I don't see a closing curly brace to mark the end of the for my $row(@CSV_File) loop. Also, it looks like there may be at least one SQL syntax error in your code as posted (extra close paren in the "SELECT" statement).

    Also, you might save some overhead by preparing your sql statements with placeholders. (And if you have names like "O'Toole", using placeholders will save you a lot of grief.)

    If your "Execute" function (wherever that is coming from) doesn't support placeholders, don't use it -- just go right to the standard DBI functions. Here's a technique I've used to good effect on several occasions, with hashes to hold the sql statements and statement handles:

    # assuming that the csv file has been opened and read into @CSV_FILE
    $dbh = DBI->connect("DBI:mysql:DBname","$un","$pw")
        or Terminate("Sorry, Disconnecting: $DBI::errstr");

    my %sql = (
        get => "SELECT id, lastname, firstname, ssn FROM member".
               " WHERE UPPER(lastname) = ? AND UPPER(firstname) = ?",
        ins => "INSERT DELAYED INTO reports (lastname, firstname, title, ID)".
               " VALUES (?,?,?,?)",
        upd => "UPDATE LOW_PRIORITY member SET working = 'Y'".
               " WHERE ID = ? LIMIT 1",
    );
    my %sth;
    $sth{$_} = $dbh->prepare( $sql{$_} ) for ( keys %sql );

    for my $csv_row(@CSV_File) {
        my ($FN, $LN, $SSN, $Title) = @$csv_row;
        $sth{get}->execute( $LN, $FN );
        my $member_rows = $sth{get}->fetchall_arrayref();
        for my $db_row ( @$member_rows ) {
            my ( $id, $ln, $fn, $sn ) = @$db_row;
            if ($SSN eq $sn) {
                $sth{ins}->execute( $ln, $fn, $Title, $id );
                $sth{upd}->execute( $id );
                &LogEvent("Success");
            }
        }
    }
    (updated to fix indents; updated again to add the missing "for (keys %sql)" to populate the %sth hash.)

    But you can probably retool this even further: the CSV file has three fields for identifying people: first_name, last_name and "ssn". Those three are also in the "members" table, but you select on the basis of first and last name, then reject a row when the ssn doesn't match.

    Why not select using all three fields in the first place?

    get => "SELECT id FROM member where".
           " ssn = ? and UPPER(lastname) = ? AND UPPER(firstname) = ?",
    If a row comes back, great -- do the update and insert for that person. If not, report that as an error (probably a typo in the csv file, or maybe a row in the members table is wrong/missing; either way, the log message should include the name along with the ssn from the csv file).
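The per-row logic described above could be sketched roughly as follows. This is only an illustration: process_row, the $sth hash of prepared statement handles, and the $log code ref are hypothetical names, and the DELAYED and LOW_PRIORITY modifiers are left out for brevity.

```perl
use strict;
use warnings;

# Hypothetical helper. $sth is a hash ref of already-prepared statement handles:
#   get => SELECT id FROM member
#          WHERE ssn = ? AND UPPER(lastname) = ? AND UPPER(firstname) = ?
#   ins => INSERT INTO reports (lastname, firstname, title, ID) VALUES (?,?,?,?)
#   upd => UPDATE member SET working = 'Y' WHERE ID = ? LIMIT 1
# $log is a code ref that records one message.
sub process_row {
    my ($sth, $log, $fn, $ln, $ssn, $title) = @_;

    $sth->{get}->execute($ssn, $ln, $fn);
    my $ids = $sth->{get}->fetchall_arrayref;    # each element is [ id ]

    if (!@$ids) {
        # No match on all three fields: log the name along with the SSN
        $log->("No member matches $fn $ln with SSN $ssn");
        return 0;
    }

    for my $row (@$ids) {
        my ($id) = @$row;
        $sth->{ins}->execute($ln, $fn, $title, $id);
        $sth->{upd}->execute($id);
        $log->("Success: $fn $ln (id $id)");
    }
    return scalar @$ids;                          # number of members updated
}
```

Inside the CSV loop this would be called as process_row(\%sth, \&LogEvent, $FN, $LN, $SSN, $Title), assuming LogEvent takes a single message string.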

    BTW, are you sure the csv file has names in all-upper-case?

    I don't know if these suggestions will help with the problem you are actually having, but if they don't, you should check out whether you really need the "DELAYED" and "LOW_PRIORITY" modifiers.

    You might also look into doing the inserts by writing the rows to a file as tab-delimited lines, and then doing "LOAD DATA LOCAL INFILE" (add LOW_PRIORITY if you want) after you have finished looping over all the csv rows. When you're doing thousands of inserts, that can save a lot of time.

      That missing curly must be a typo on this page; I'm not getting that kind of error.

      I'm not sure what you mean by placeholders...?

      My Execute sub does one of the two things below; I just thought it would be self-explanatory:

      sub Execute {
          if ($_[1] == 1) {
              $sth = $dbh->prepare( $_[0] ) || &Terminate('Sorry, not Preparing');
              $sth->execute || &Terminate('Not Executing');
          } else {
              $dbh->do( $_[0] ) || &Terminate('No Do');
              warn( $DBI::errstr ) if ( $DBI::err );
              $rc=$dbh->disconnect();
      }

      I need to compare SSN to make sure there are no typos on either the CSV or DB tables. There are over 16K members and I actually intend to log the errors found.

      And Yes, the CSV is definitely all Uppercase (I don't know why).

      I used the Delay and Low_Priority when I started giving up and trying other things, unfortunately, I don't see a difference... just left it in since then.

      I was considering the Load Data Local Infile things, but realized my limitation when I needed to compare stuff before the Inserting/Updating happens.

        I'm not sure what you mean by placeholders...?

        Search for that term in the DBI manual, read carefully to understand all the ways they can make your code better, and work it into your code. Basically, you create a query string with one or more question mark characters where a value would be -- each question mark is a "placeholder" for a value. Then you do $sth=$dbh->prepare($statement) in the normal way, and when you execute it, you pass values for each placeholder: $sth->execute(@values) (or $sth->execute($val1,$val2,...)), as shown in the code I posted above.
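A small sketch of why this matters, using the "O'Toole" case mentioned earlier in the thread. The helper find_ids_by_last_name is a hypothetical name, and $dbh is assumed to be an already-connected DBI handle; the first half runs without any database at all.

```perl
use strict;
use warnings;

my $last_name = "O'TOOLE";

# Interpolating the value puts a stray single quote into the SQL text.
my $interpolated = "SELECT id FROM member WHERE UPPER(lastname) = '$last_name'";
my $quote_count  = () = $interpolated =~ /'/g;   # odd count: the SQL string literal is unbalanced

# With a placeholder, the SQL text never contains the value at all.
my $with_placeholder = "SELECT id FROM member WHERE UPPER(lastname) = ?";

# Hypothetical helper: $dbh is an already-connected DBI handle.
sub find_ids_by_last_name {
    my ($dbh, $name) = @_;
    my $sth = $dbh->prepare($with_placeholder);
    $sth->execute($name);                         # the value is bound here, not interpolated
    return map { $_->[0] } @{ $sth->fetchall_arrayref };
}
```

The same prepared handle can be executed again for every CSV row, which is where the saving in overhead comes from.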

        I was considering the Load Data Local Infile things, but realized my limitation when I needed to compare stuff before the Inserting/Updating happens.

        I'm not understanding what your "limitation" is. You just open an output file before the loop; within the loop, instead of executing an insert statement, you print a line to the file; after the loop you close the file and execute a "LOAD DATA LOCAL INFILE ..." statement. You can't use this approach for the updates -- those still need to be executed on each loop iteration.
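The file-then-load approach might look roughly like this. It is a sketch under assumptions: tsv_escape, queue_report and load_reports are hypothetical names, the escaping targets LOAD DATA's default format (tab-separated fields, newline-terminated lines, backslash escapes), and LOCAL INFILE may need to be enabled on both the client and the server.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Escape one value for LOAD DATA's default field and line format.
sub tsv_escape {
    my ($value) = @_;
    return '\N' unless defined $value;    # \N is read back as NULL
    $value =~ s/\\/\\\\/g;                # backslash first, so later escapes are not doubled
    $value =~ s/\t/\\t/g;
    $value =~ s/\n/\\n/g;
    return $value;
}

# Open the output file before the loop over the CSV rows.
my ($fh, $path) = tempfile('reports_XXXXXX', SUFFIX => '.tsv', TMPDIR => 1);

# Inside the loop: print a line instead of executing the INSERT.
sub queue_report {
    my ($ln, $fn, $title, $id) = @_;
    print {$fh} join("\t", map { tsv_escape($_) } $ln, $fn, $title, $id), "\n";
}

# After the loop: close the file and load every row in one statement.
sub load_reports {
    my ($dbh) = @_;                       # assumed: a connected DBI handle
    close $fh or die "close $path: $!";
    my $sql = sprintf
        "LOAD DATA LOCAL INFILE %s INTO TABLE reports (lastname, firstname, title, ID)",
        $dbh->quote($path);
    return $dbh->do($sql);
}
```

The SSN comparison still happens inside the loop; only rows that pass it are handed to queue_report, and the UPDATE statements run per row as before.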

        (updated to improve grammar)

Re: mySQL Times Out / Disconnects
by stonecolddevin (Parson) on Sep 30, 2007 at 01:14 UTC

    You're missing a lot of pertinent code.

    It would help to see what &Execute looked like, and when you're dumping MySQL/DBI errors, it's best to do so like this:

    sub SQL {
        $dbh = DBI->connect("DBI:mysql:DBname","$un","$pw") || Terminate('Sorry, Disconnecting' . $DBI::errstr); # Prints error . $_[0]
    } # END SQL

    Other than that, this sounds like a time out issue if there aren't any errors being thrown by DBI like you say. Update your node and I'm sure we can solve this problem without too much trouble.

    meh.
      Unless the OP has been tinkering with the MySQL parameters, it seems unlikely to be a time-out issue as --IIRC-- the time-out value of an "out-of-the-box" MySQL is something like 8 hours.

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      I've tried this and it failed without returning anything for the $DBI::errstr at all.
Re: mySQL Times Out / Disconnects
by ww (Archbishop) on Sep 30, 2007 at 01:18 UTC
Re: mySQL Times Out / Disconnects
by NetWallah (Canon) on Sep 30, 2007 at 04:34 UTC
    In addition to program-generated diagnostic messages, you may also want to look at the MySQL database log.

    In my case, on a Win32 system, in a situation similar to yours, where I was regularly appending records to a DB, and getting errors every few hours, the problem was related to McAfee Antivirus scanning the DB file area.

    After I prevented McAfee from accessing the DB storage area, the errors went away.

         "As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom

Re: mySQL Times Out / Disconnects
by jeanluca (Deacon) on Sep 30, 2007 at 09:27 UTC
    Did you try to set
    $dbh->{'mysql_auto_reconnect'} = 1 ;
    That solved for me the MySQL-disconnect problems!

    LuCa
      Yes, I tried mysql_auto_reconnect as well. I just can't figure it out... It still disconnects. I know it does connect in the first place, because I see the results with phpMyAdmin and truncate my reports table from there.
Re: mySQL Times Out / Disconnects
by Gangabass (Vicar) on Oct 01, 2007 at 02:39 UTC

    Did you look into the MySQL error log? Can you show your Execute sub? Also, I see you didn't use placeholders. My question is "Why?"

Re: mySQL Times Out / Disconnects
by JayBee (Scribe) on Oct 03, 2007 at 12:23 UTC
    I started eliminating things one at a time, and after a few attempts I found that I was calling

    $sth->finish;
    $dbh->disconnect();

    after the while fetchrow_array() block, which caused some errors. I've now changed it to:

    $sth->finish if $sth;
    $dbh->disconnect() if $dbh;

    That now gives me a "Too many connections" error, which I looked up:
    http://dev.mysql.com/doc/refman/5.0/en/too-many-connections.html
    I still don't know what to do about it, other than using the Apache::DBI module, which a Super Search turned up.
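One possible direction, sketched under the assumption that the errors come from connecting more than once per run and from the disconnect inside the posted Execute sub: keep a single handle for the whole run, reconnect only when it is no longer alive, and disconnect once after the loop. The names get_dbh, run_do and finish_run are hypothetical, and $connect stands for whatever code opens the connection.

```perl
use strict;
use warnings;

my $dbh;    # the one handle shared by every query in the run

# $connect is a code ref that opens a connection, for example:
#   sub { DBI->connect("DBI:mysql:DBname", $un, $pw, { RaiseError => 1 }) }
sub get_dbh {
    my ($connect) = @_;
    # Reconnect only if there is no handle yet or the server no longer answers
    $dbh = $connect->() unless $dbh && $dbh->ping;
    return $dbh;
}

# Run a non-SELECT statement. Note that there is no disconnect in here.
sub run_do {
    my ($connect, $sql, @bind) = @_;
    return get_dbh($connect)->do($sql, undef, @bind);
}

# Call once, after the loop over the CSV rows has finished.
sub finish_run {
    $dbh->disconnect if $dbh;
    undef $dbh;
}
```

With this shape, thousands of rows use one connection, so the server's connection limit is not approached by a single run.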