in reply to Load CSV file into Database

You are preparing the same statement over and over. Use placeholders instead.

If you pass DBI the RaiseError flag on connecting, you can avoid checking the success of each DBI operation manually.

You also read the entire file into an array, then process it element by element. Read as you go instead.

While we're at it, don't open the file yourself at all. Using the diamond operator to read from the files in @ARGV will handle errors automatically and lets you handle multiple files at no extra cost.

There's also no provision in your script to pass the connection parameters in from the outside, so you have to change the source for every database it is supposed to run against. Look into Getopt::Long.

#!/usr/bin/perl -w
use strict;
use DBI;
use Text::CSV;
use Getopt::Long;

use constant USAGE => "usage: $0 [ -h hostname ] [ -p port ] --db dbname -u user -P passwd -t table file.csv [ file.csv ... ]";

GetOptions(
    'h|host=s'  => \( my $opt_host = "localhost" ),
    'p|port=i'  => \( my $opt_port = "3306" ),
    'db=s'      => \( my $opt_dbname ),
    'u|user=s'  => \( my $opt_user ),
    'P|pass=s'  => \( my $opt_pass ),
    't|table=s' => \( my $opt_table ),
) or die USAGE;

die USAGE if not( $opt_dbname and $opt_user and $opt_pass and $opt_table );

my $dbh = DBI->connect(
    "dbi:mysql:dbname=$opt_dbname;host=$opt_host;port=$opt_port",
    $opt_user, $opt_pass,
    { RaiseError => 1, PrintError => 0 }
) or die "Connection error: $DBI::errstr";

my $csv = Text::CSV->new();

my $sth = do {
    $csv->parse( scalar <> ); # first line holds the column names
    my @field = $csv->fields();
    my $sql =
        "INSERT INTO $opt_table ("
        . join( ',', @field )
        . ") VALUES ("
        . join( ',', ( '?' ) x @field )
        . ")";
    print STDERR "Using query: $sql\n";
    $dbh->prepare( $sql );
};

# # broken: silently skips lines that don't parse on their own
# while( <> ) {
#     $csv->parse( $_ ) or next;
#     $sth->execute( $csv->fields );
# }

my ( $buff, $lnr ) = ( '', $. + 1 );
while( <> ) {
    $buff .= $_;
    if( $csv->parse( $buff ) ) {
        $sth->execute( $csv->fields );
        $buff = '';
        $lnr = $. + 1;
    }
}
die "Error on line $lnr\n" if length $buff;

Untested.

Update: fixed code as per tilly's example.

Makeshifts last the longest.

Re^2: Load CSV file into Database
by tilly (Archbishop) on Jan 31, 2005 at 09:26 UTC
    Both your solution and the previous solution will silently omit lines with embedded returns. :-(

      Yeah, that is a limitation of Text::CSV. I chose to stick with the OP's choice of module anyway, but of course I should have said something about that.

      Makeshifts last the longest.

        It is not an insurmountable limitation. Every time a line doesn't parse, accumulate it into a buffer and prepend the buffer to the next line. That handles embedded returns.

        There is no reason that this piece of logic couldn't be built into Text::CSV. You could even name it getline to mirror what Text::CSV_XS does. But it isn't there so people tend to get this wrong.

        Incidentally, here it is (untested):

        sub Text::CSV::getline {
            my $self = shift;
            my $fh   = shift;
            my $line = "";
            my $line_no;
            while (<$fh>) {
                $line_no ||= $.;
                $line .= $_;
                return [ $self->fields ] if $self->parse($line);
            }
            if ($line) {
                die "CSV parse error at line $line_no";
            }
            return;
        }
Re^2: Load CSV file into Database
by rdfield (Priest) on Jan 31, 2005 at 09:26 UTC
    For anything other than MySQL, you'll probably want to commit every so often, and also at the end.

    rdfield
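    In case it helps, here is a sketch of what that batching might look like, grafted onto the insert loop from the script above. It assumes the same $dbh, $sth and $csv variables; AutoCommit is switched off in the connect attributes, and the batch size of 500 is arbitrary:

    ```perl
    # Sketch only: reuses $opt_* / $csv / $sth from the script above.
    # AutoCommit => 0 groups the inserts into explicit transactions.
    my $dbh = DBI->connect(
        "dbi:mysql:dbname=$opt_dbname;host=$opt_host;port=$opt_port",
        $opt_user, $opt_pass,
        { RaiseError => 1, PrintError => 0, AutoCommit => 0 }
    );

    my $rows = 0;
    my ( $buff, $lnr ) = ( '', $. + 1 );
    while( <> ) {
        $buff .= $_;
        if( $csv->parse( $buff ) ) {
            $sth->execute( $csv->fields );
            $buff = '';
            $lnr  = $. + 1;
            # commit every 500 rows (pick whatever suits your DB)
            $dbh->commit if ++$rows % 500 == 0;
        }
    }
    die "Error on line $lnr\n" if length $buff;
    $dbh->commit;    # flush the final partial batch
    ```

    With RaiseError set, a failed execute dies before the commit, so a partial batch is rolled back when the handle is destroyed rather than half-applied.
    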