in reply to Multiple queries on DBI corresponding to multiple csv files?

The basic idea appears to be straightforward, but there are some complications. I wouldn't use a hash representation for this application.

Some untested code follows...
I would use {RaiseError => 1} in the DBI connect. That way you do not have to check with "or die" after every prepare or execute - a "die" happens automatically whenever a DBI call fails.
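
To illustrate, an untested sketch (the SQLite DSN and table name are just stand-ins for whatever DB you are actually using): with RaiseError set, any failing DBI call throws its own exception, and if you want a single recovery point you can wrap a block of work in an eval.

use strict;
use warnings;
use DBI;

# RaiseError => 1 makes every failing DBI call die with the DBI error text,
# so no "or die" is needed after each prepare/execute.
my $dbh = DBI->connect("dbi:SQLite:dbname=example.db", "", "",
                       { RaiseError => 1 });

# Optional: trap all DBI exceptions for a block of work in one place.
eval {
    my $sth = $dbh->prepare("SELECT * FROM some_table");
    $sth->execute();
};
warn "DB work failed: $@" if $@;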

The big problem here is that writing a CSV file can be devilishly tricky when there is an embedded "," in a data field, like 'James,Bob,Jr' or whatever. I think (if I remember right) that a string should also be quoted if it contains embedded spaces, like "Jessie Jones". The full CSV spec is more complicated than you might think, and there are "bad" implementations that don't meet the spec. The Perl CSV module pretty much "comes up with the ball", even for not-quite-right input. Generating proper CSV output is easier than reading a "supposed CSV" input, but there is still more than one way to screw it up.
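
A small illustration of what I mean (this sketch assumes the CPAN Text::CSV module, which quietly uses Text::CSV_XS when that is installed):

use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new({ binary => 1 });

# One quoted field with an embedded comma, one with embedded spaces:
my $line = '"James,Bob,Jr","Jessie Jones",42';

$csv->parse($line) or die "parse failed on: " . $csv->error_input;
my @fields = $csv->fields;    # ('James,Bob,Jr', 'Jessie Jones', '42')
print "$_\n" for @fields;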

Mileage varies, but I often use the pipe "|" character instead of a comma "," when writing a simple delimited file. That way I don't have to worry about using the CSV module at all. I only do that for files which I privately consume - not for export to other programs - although I actually receive multi-million line address files that are delimited that way. "Comma Separated Value" is a bit of a misnomer: almost any character can be used as the delimiter. In Excel you just say "hey, this is a pipe-delimited file" and away you go, albeit with a few extra mouse clicks.
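
For the private pipe-delimited case, a plain join/split is usually all it takes. An untested sketch (the file name is made up, and this assumes no field can ever contain a literal "|"):

use strict;
use warnings;

# Writing: one record per line, fields joined on "|".
open my $out, ">", "roster.psv" or die "can't write roster.psv: $!";
print {$out} join("|", "James,Bob,Jr", "Jessie Jones", 42), "\n";
close $out;

# Reading it back: split on a literal pipe (note the backslash - "|" is a regex metachar).
open my $in, "<", "roster.psv" or die "can't read roster.psv: $!";
while (my $line = <$in>) {
    chomp $line;
    my @fields = split /\|/, $line;
    print "field: $_\n" for @fields;
}
close $in;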

If you really want a "solid" CSV file, using an actual comma as the field delimiter, I would use the Perl CSV module (Text::CSV / Text::CSV_XS).
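
Here is an untested sketch of what the output side could look like with Text::CSV_XS doing the quoting; the file name, column names, and rows are just stand-ins for what the DBI calls in the code below would hand you.

use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new({ binary => 1, eol => "\n" });

# Stand-ins for $sth->{NAME} and fetchall_arrayref() results:
my $col_names = [ 'name', 'city', 'id' ];
my $table_ref = [ [ 'James,Bob,Jr', 'New York', 1 ],
                  [ 'Jessie Jones', 'Chicago',  2 ] ];

open my $file, ">", "CSVtb1.csv" or die "unable to open CSVtb1.csv: $!";
$csv->print($file, $col_names);          # header row, quoted only where needed
$csv->print($file, $_) for @$table_ref;  # one correctly quoted line per row
close $file;
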
Anyway, I think something like this is similar to your code.

## untested ##
my @tables = qw(tb1 tb2 tb3);

# DBI CONNECTION
my $dbh = DBI->connect("dbi:Ingres:$dbname", $user, "", {RaiseError => 1})
    or die "Could not connect to database $dbname\n";

foreach my $table (@tables)
{
    my $get_table = $dbh->prepare("SELECT * FROM $table");
    $get_table->execute();

    my $col_names = $get_table->{NAME};   # column names live on the statement handle
    my $table_ref = $get_table->fetchall_arrayref;

    open my $file, ">", "CSV$table.csv"
        or die "unable to open CSV$table.csv for writing $!";

    print $file join(",", @$col_names), "\n";    # header row
    print $file join(",", @$_), "\n" for @$table_ref;

    close $file;
}

Re^2: Multiple queries on DBI corresponding to multiple csv files?
by jtech (Sexton) on Feb 20, 2019 at 14:32 UTC

    Appreciate the tips!

    I have started looking into the DBI error handling - really cool. For example:

    my %attr = (
        AutoCommit         => 0,   # Require explicit commit or rollback.
        PrintError         => 1,
        ShowErrorStatement => 0,
        RaiseError         => 0,
        HandleError        => \&Usage,
    );
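
    For what it is worth, DBI calls the HandleError routine with three arguments: the error string, the handle, and the first value returned by the failed method; returning false falls through to PrintError/RaiseError as usual. A minimal sketch of what a Usage handler might look like (the name and the message format are placeholders):

    use strict;
    use warnings;

    sub Usage {
        my ($err_msg, $handle, $first_value) = @_;
        warn "DBI error: $err_msg\n";   # or print usage/help text, log, etc.
        return 0;                       # false: let PrintError/RaiseError act as usual
    }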

    On my TODO list:

    replace "," for "|"

    use Text::CSV_XS as the Perl CSV module

    Cheers!

      Happy to see that you are investigating some of these other options!

      Normally, you don't have to fiddle with "AutoCommit" as the default is "true", at least it is with SQLite. Basically you want write operations to commit automatically unless you override that behavior explicitly.

      A commit operation is very expensive - the DB has to do a lot of work to make sure the data is "really there". The DB will have a limit on the number of transactions (basically commits) per second, while the number of individual operations per second it can handle is more than an order of magnitude higher. One reason to manually control the start and end of a transaction is when doing a lot of inserts at once. For example:

      my $addRoster = $dbh->prepare("INSERT INTO Roster (Call,Owner) VALUES (?,?)");

      $dbh->do("BEGIN");    # start a new transaction for all inserts
      foreach my $row (@$info)    # the 'roster' sheet of Roster.xls  # 20 Feb 2018
      {
          # ...blah...blah...
          $addRoster->execute($callsign, $owner);
      }
      $dbh->do("COMMIT");   # massive speed up by making one transaction!!
      The above code slams data from an Excel spreadsheet into a DB. The sequence is typical: an SQL statement is prepared outside the loop and then executed many times with different data values. A prepared statement is specific to one table, so this can't be done with the table name as a bind variable. The above code has been in production for a year and takes <1 second to run. If I take out the BEGIN and COMMIT statements, each insert gets committed automatically and the run takes ~10 seconds - a 10x difference. You can run a million inserts as a single transaction (I have code that does exactly that), and it can make the difference between, say, 20 seconds and 4-6 hours!
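
      The same idea using DBI's own transaction methods, with a rollback if anything in the batch fails - an untested sketch against a throwaway SQLite table with made-up rows (begin_work switches AutoCommit off for just the one transaction):

      use strict;
      use warnings;
      use DBI;

      # Throwaway SQLite DB so the sketch is self-contained; any DBD with
      # transaction support works the same way.
      my $dbh = DBI->connect("dbi:SQLite:dbname=roster.db", "", "",
                             { RaiseError => 1, AutoCommit => 1 });
      $dbh->do("CREATE TABLE IF NOT EXISTS Roster (Call TEXT, Owner TEXT)");

      my $add = $dbh->prepare("INSERT INTO Roster (Call,Owner) VALUES (?,?)");

      $dbh->begin_work;                     # like do("BEGIN"), but driver portable
      eval {
          $add->execute(@$_) for ( ['W1AW', 'ARRL'], ['K7ABC', 'Somebody'] );
          $dbh->commit;                     # one commit for the whole batch
      };
      if ($@) {
          warn "batch insert failed, rolling back: $@";
          $dbh->rollback;
      }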