in reply to Load CSV file into Database

You are preparing the same statement over and over. Use placeholders instead.

If you pass DBI the RaiseError flag on connecting, you can avoid checking the success of each DBI operation manually.

You also read the entire file into an array, then process it element by element. Read as you go instead.

While we're at it, don't open the file yourself at all. Using the diamond operator to read from the files in @ARGV will handle errors automatically and lets you handle multiple files at no extra cost.

There's also no provision in your script to pass the connection parameters in from the outside, so you have to change the source for every database it is supposed to run against. Look into Getopt::Long.

#!/usr/bin/perl -w
use strict;
use DBI;
use Text::CSV;
use Getopt::Long;

use constant USAGE => "usage: $0 [ -h hostname ] [ -p port ] --db dbname -u user -P passwd -t table file.csv [ file.csv ... ]";

GetOptions(
    'h|host=s'  => \( my $opt_host = "localhost" ),
    'p|port=i'  => \( my $opt_port = "3306" ),
    'db=s'      => \( my $opt_dbname ),
    'u|user=s'  => \( my $opt_user ),
    'P|pass=s'  => \( my $opt_pass ),
    't|table=s' => \( my $opt_table ),
) or die USAGE;

die USAGE if not( $opt_dbname and $opt_user and $opt_pass and $opt_table );

my $dbh = DBI->connect(
    "dbi:mysql:dbname=$opt_dbname;host=$opt_host;port=$opt_port",
    $opt_user, $opt_pass,
    { RaiseError => 1, PrintError => 0 }
) or die "Connection error: $DBI::errstr";

my $csv = Text::CSV->new();

my $sth = do {
    $csv->parse( scalar <> ); # first line holds the column names
    my @field = $csv->fields();
    my $sql =
        "INSERT INTO $opt_table ("
        . join( ',', @field )
        . ") VALUES ("
        . join( ',', ( '?' ) x @field )
        . ")";
    print STDERR "Using query: $sql\n";
    $dbh->prepare( $sql );
};

# # broken: silently skips lines that don't parse on their own
# while( <> ) {
#     $csv->parse( $_ ) or next;
#     $sth->execute( $csv->fields );
# }

my ( $buff, $lnr ) = ( '', $. + 1 );
while( <> ) {
    $buff .= $_;
    if( $csv->parse( $buff ) ) {
        $sth->execute( $csv->fields );
        $buff = '';
        $lnr = $. + 1;
    }
}
die "Error on line $lnr\n" if length $buff;

Untested.

Update: fixed code as per tilly's example.

Makeshifts last the longest.

Re^2: Load CSV file into Database
by tilly (Archbishop) on Jan 31, 2005 at 09:26 UTC
    Both your solution and the previous solution will silently omit lines with embedded returns. :-(

      Yeah, that is a limitation of Text::CSV. I chose to stick with the OP's choice of module anyway, but of course I should have said something about that.

      Makeshifts last the longest.

        It is not an insurmountable limitation. Every time a line doesn't parse, accumulate it into a buffer and prepend the buffer to the next line. That handles embedded returns.

        There is no reason that this piece of logic couldn't be built into Text::CSV. You could even name it getline to mirror what Text::CSV_XS does. But it isn't there so people tend to get this wrong.

        Incidentally, here it is (untested):

        sub Text::CSV::getline {
            my $self = shift;
            my $fh   = shift;
            my $line = "";
            my $line_no;
            while (<$fh>) {
                $line_no ||= $.;
                $line .= $_;
                return [ $self->fields ] if $self->parse($line);
            }
            if ($line) {
                die "CSV parse error at line $line_no";
            }
            return;
        }
Re^2: Load CSV file into Database
by rdfield (Priest) on Jan 31, 2005 at 09:26 UTC
    For anything other than MySQL, you'll probably want to commit every so often, and also at the end.

    rdfield
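    In case it helps, here is a sketch of what that batching might look like, grafted onto the insert loop from the script above. It assumes the same $dbh, $sth and $csv variables; AutoCommit is switched off in the connect attributes, and the batch size of 500 is arbitrary:

    ```perl
    # Sketch only: reuses $opt_* / $csv / $sth from the script above.
    # AutoCommit => 0 groups the inserts into explicit transactions.
    my $dbh = DBI->connect(
        "dbi:mysql:dbname=$opt_dbname;host=$opt_host;port=$opt_port",
        $opt_user, $opt_pass,
        { RaiseError => 1, PrintError => 0, AutoCommit => 0 }
    );

    my $rows = 0;
    my ( $buff, $lnr ) = ( '', $. + 1 );
    while( <> ) {
        $buff .= $_;
        if( $csv->parse( $buff ) ) {
            $sth->execute( $csv->fields );
            $buff = '';
            $lnr  = $. + 1;
            # commit every 500 rows (pick whatever suits your DB)
            $dbh->commit if ++$rows % 500 == 0;
        }
    }
    die "Error on line $lnr\n" if length $buff;
    $dbh->commit;    # flush the final partial batch
    ```

    With RaiseError set, a failed execute dies before the commit, so a partial batch is rolled back when the handle is destroyed rather than half-applied.
    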