bradcathey has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monasterians:

This might fall more into the realm of a MySQL forum, but I'll take a chance here in a familiar place.

I wrote a little script that uploads a CSV file and imports it into a table. I do this all the time, but this time only a few lines are being imported. With a larger version of the same CSV file, it imports the first few lines, a couple from the middle, and then the last one.

I've examined it for "gremlins" and find the data clean. Frankly, I'm at a loss. Here's the statement that gets executed, or you can view the whole script below.

$stmt = qq/LOAD DATA LOCAL INFILE "..\/$cvstoupload" INTO TABLE statezones
    FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY """"
    LINES TERMINATED BY "\r" IGNORE 1 LINES/;

Whole CSV file:

STATE,ABBREV,ZONE
Alabama,AL,2
Alaska,AK,4
Arizona,AZ,3
Arkansas,AR,2
California,CA,3
Colorado,CO,2
Connecticut,CT,2
Delaware,DE,2

All that imports:

Alabama,AL,2
Alaska,AK,4
Arizona,AZ,3

Entire script:

#!/usr/bin/perl -T
use lib "/home/gotjunk/www/cgi-bin";
use warnings;
use CGI::Carp qw(fatalsToBrowser);
use strict;
use CGI;
use Crypt::CBC;
require Common;

#------------------------- declare --------------------------
my $query = new CGI;
my $dbh   = &dbconnect;            #open DBI connection in Common

#------------------------ grab input ------------------------
my $cvstoupload = &upload_file($query->param('filename'));

my $stmt = "DELETE FROM statezones";    #dump the old data
&execute_it($stmt);

#----------------- open .db file and insert -----------------
$stmt = qq/LOAD DATA LOCAL INFILE "..\/$cvstoupload" INTO TABLE statezones
    FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY """"
    LINES TERMINATED BY "\r" IGNORE 1 LINES/;
my $sth = $dbh->prepare($stmt);
$sth->execute();

unlink "../$cvstoupload" or die "unlink: $!";

$sth->finish();
$dbh->disconnect();
exit();

# ----------------------- upload file -----------------------
sub upload_file {
    $| = 1;
    my $sourcefile = shift;
    my ($buffer, $bytes);

    $sourcefile =~ /([\w .-]+)$/i;
    my $newfile = $1;
    $newfile =~ s/ //;

    open (OUTFILE, ">../$newfile") or die "Cannot open $newfile: $!";
    binmode(OUTFILE);
    while ($bytes = read($sourcefile, $buffer, 1024)) {
        print OUTFILE $buffer;
    }
    close(OUTFILE) or die "Close: $!";

    chmod (0666, ">../$newfile");
    return ($newfile);
}

Thanks in advance

UPDATE: After testing all conditions and not finding anything in particular, I rebuilt the table and CSV file and got it to work. Thanks to those who responded.


—Brad
"Don't ever take a fence down until you know the reason it was put up." G. K. Chesterton

Re: Unable to load entire CSV file into DB
by cchampion (Curate) on Nov 06, 2004 at 17:22 UTC

    Something for you to investigate:

    • Are you sure that the file is uploaded completely?
    • Are you checking for database errors? (A minimal check is sketched just after this list.)
    • Have you read Before asking a database related question ...?
    • Did you try that SQL statement outside your script? I mean, manually, from the mysql monitor?
    • Does the table structure correspond to the CSV structure?
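
    On the error-checking point: LOAD DATA LOCAL tends to treat problem rows as warnings rather than errors, so RaiseError alone can stay quiet while rows are skipped. Checking the row count and SHOW WARNINGS would expose that. A sketch only, with placeholder DSN, credentials and statement:

    use strict;
    use warnings;
    use DBI;

    # Sketch only: the DSN, credentials and statement are placeholders.
    my ($dsn, $user, $pass) = ('DBI:mysql:database=test;host=localhost', 'me', 'secret');
    my $stmt = 'LOAD DATA LOCAL INFILE ...';            # your LOAD DATA statement here
    my $dbh  = DBI->connect($dsn, $user, $pass, { RaiseError => 1 });

    my $rows = $dbh->do($stmt);                         # number of rows actually loaded
    print "rows loaded: $rows\n";
    for my $warning (@{ $dbh->selectall_arrayref('SHOW WARNINGS') }) {
        print join(' | ', @$warning), "\n";             # level, code, message per skipped row
    }
    $dbh->disconnect;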

    If the answer to all the above questions is "yes", then I am prepared to accept that you might have a Perl problem. Otherwise, I would point you to a more appropriate place to ask your question.

    Moreover, you are asking about a database-related error, yet you are showing us an upload script.

    Is that relevant?

    If not, why don't you trim the example down to the only thing that matters?

      Thanks for the tips.

      Are you sure that the file is uploaded completely?

      Yes, in all cases, it's all there

      Are you checking for database errors?

      I'm using { RaiseError => 1 } and $dbh->trace(2), and see no errors.

      Have you read Before asking a database related question ...?

      I have, but just re-read it.

      Did you try that SQL statement outside your script? I mean, manually, from the mysql monitor?

      Did not. Working on a remote shared web server.

      Does the table structure correspond to the CSV structure?

      Dead on!

      Again, some of the file is importing. It seems like there is some hidden character in the CSV. Per your suggestion, I will ask in a more 'targeted' forum. Thanks!


      —Brad
      "Don't ever take a fence down until you know the reason it was put up." G. K. Chesterton

        I expect the file format will be the issue, quite possibly the line endings. The first simple thing to try in your upload is this:

        $buffer =~ s/[\r\n]+/\n/g;

        to standardise all your line endings. You could use \r if you want, but \n is the default and the standard, and it's less typing in your LOAD DATA.

        If that does not work then I suggest you use perl to read and validate the uploaded file before you pass it to mysql. I too suggest trying to load the file manually. You could download the file and try loading it into a local MySQL DB if you don't have access to the remote one.
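
        A rough sketch of that kind of pre-check, run against the file the upload sub has already written (the path and the three-field count are purely illustrative):

        use strict;
        use warnings;

        # Normalise line endings and confirm each record has three fields
        # before handing the file to LOAD DATA.
        my $csv = "../statezones.csv";                 # illustrative path
        open my $in, '<:raw', $csv or die "open: $!";
        my $data = do { local $/; <$in> };             # slurp the whole file
        close $in;

        $data =~ s/\r\n?/\n/g;                         # CRLF or bare CR -> LF
        my @lines  = split /\n/, $data;
        my $header = shift @lines;                     # e.g. "STATE,ABBREV,ZONE"
        for my $i (0 .. $#lines) {
            my @fields = split /,/, $lines[$i], -1;
            warn "record ", $i + 2, " has ", scalar @fields, " fields\n"
                if @fields != 3;
        }

        open my $out, '>', $csv or die "open: $!";     # write the cleaned copy back
        print $out join("\n", $header, @lines), "\n";
        close $out or die "close: $!";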

        cheers

        tachyon

        Did not. Working on a remote shared web server.

        Don't you think that this very step could tell you in which direction to look?

        I would try to do the testing on a local machine. Installing MySQL is not rocket science, and your table does not seem to have a million records.

        So, instead of guessing, why don't you test it?

Re: Unable to load entire CSV file into DB
by Zaxo (Archbishop) on Nov 06, 2004 at 17:46 UTC

    With minimal changes, I think that your LOAD DATA statement should read,

    my $stmt = qq(LOAD DATA LOCAL INFILE '../$cvstoupload' INTO TABLE statezones
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
        LINES TERMINATED BY '\r' IGNORE 1 LINES);

    That takes care of quoting problems. Is your record separator really "\r"? That seems odd.

    I think that the relative path will be taken by mysqld to be relative to the directory of the database you're logged into. Use absolute paths.
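
    For instance, something like this (a sketch, with $cvstoupload as in the original script) would hand mysql an absolute path:

    use File::Spec;

    # Resolve the upload's relative path against the CGI's working directory
    # before it is interpolated into the LOAD DATA statement.
    my $infile = File::Spec->rel2abs("../$cvstoupload");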

    I'm not sure how this has worked at all. Regarding the data file, check with a hex dump or something that the record separators are all consistent and are what you think they are.
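
    A quick throwaway check along those lines (the path is illustrative):

    # Count which line endings actually occur in the uploaded file.
    open my $fh, '<:raw', '../statezones.csv' or die "open: $!";
    my $data = do { local $/; <$fh> };
    close $fh;

    my $crlf = () = $data =~ /\r\n/g;
    my $cr   = () = $data =~ /\r(?!\n)/g;
    my $lf   = () = $data =~ /(?<!\r)\n/g;
    print "CRLF: $crlf   bare CR: $cr   bare LF: $lf\n";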

    This seems like an odd application to do as CGI. Are you cursed with no shell access to the host? If so, I'd change host - this is an insecure thing to have around.

    After Compline,
    Zaxo

      This seems like an odd application to do as CGI.
      Since I'll be implementing something like this real soon, I'd like to offer a scenario where this is the appropriate solution.

      In a project I'm doing, the customer would really like a whiz-bang database interface with GUI, but hasn't quite figured out what they want. They'd like programmatic pick-lists of data supplied from sources on other hosts, lots of ease-of-use features, and all those wonder-widgets that make administrative work less painful.

      But the one group hasn't yet figured out their current processes. In other words, the people who want the fine admin tool aren't ready to define that phase of the project.

      But the "funding organization" needs the frontend functionality quickly, before this year's budget runs out! The guy with the money needs something now, and I find myself actually able to propose "What is the simplest thing that could possibly work?" in order to get the frontend process up and running for the 'real' users.

      Until the admin people can define the new admin process, we'll give them the ability to submit a CSV file, with a column-name header, as exported from Excel. After they have added/modified/deleted items, they'll export and upload the CSV file via CGI. The CGI will perform basic sanity checks and import into the minimal DB driving the initial implementation.
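
      The sanity check itself can stay very small. A sketch, with purely illustrative column names and path:

      use strict;
      use warnings;

      # Confirm the header row names the expected columns before importing.
      my @expected = qw(STATE ABBREV ZONE);        # illustrative column list
      open my $fh, '<', 'upload.csv' or die "open: $!";
      chomp(my $header = <$fh>);
      $header =~ s/\r$//;                          # tolerate CRLF exports from Excel
      my @got = split /,/, $header;
      die "unexpected header: $header\n"
          unless join(',', map { uc } @got) eq join(',', @expected);
      close $fh;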

      Web access remains the most readily available and usable interface for admin people to interact with the implementation host. But since their specification is (cough) evolving, forcing them to use Excel/CSV initially will allow the project to progress, will give them practice defining where data items come from (their process), and will annoy them to the point of wanting to "get specific."