Re: using stdin with sqlldr
by runrig (Abbot) on Nov 20, 2015 at 22:06 UTC
New answer to an old post, but here's an example of loading into a table using DBIx::BulkUtil (a brand new module as of this post). The library can do the traditional "load from a file", or you can supply a function that generates and returns the rows of data (or pass in your own file handle for sqlldr to read) to be fed to sqlldr through stdin:
my ($dbh, $dbu) = DBIx::BulkUtil->ora_connect(
  Database => $database,
  User     => $user,
  Password => $pw,
);

# Assuming two column table
# Default is "|" column delimited, "\n" row delimited
my @rows = qw(
  abc|def
  ghi|jkl
);

sub insert_row {
  my $data = shift @rows or return;
  return $data . "\n";
}

$dbu->bcp_in($table, '-', {
  Stdin => \&insert_row,
});

$dbh->disconnect();
Re: using stdin with sqlldr
by graff (Chancellor) on Nov 18, 2008 at 02:54 UTC
Typically I write each string to a sqlldr file separated by a delimiter and then when finished I load the entire file. I would like to send each line in my script to sqlldr using stdin. Currently, after I have finished writing to the sqlldr file I just call the sqlldr command...
What you typically do sounds like the Right Way To Do It. Why do you think it would be better to run sqlldr on one line of data at a time? What benefit do you think this will provide?
I haven't had occasion to use sqlldr for several years now, so I don't know whether it supports reading data from stdin -- and that's not a perl question; you have to look up the docs for sqlldr.
Considering what sqlldr is supposed to do and how it does that, I'd rather have a single file with lots of rows to be inserted in a single run: some rows might fail for various reasons, and sqlldr is very good about handling those, setting them aside, reporting the problems, etc. Having a stable reference for the input data (i.e. a disk file rather than a pipeline stream) might make it easy to do error recovery, diagnosis, etc.
BTW, did you happen to notice these links on the node composition page?
Markup in the Monastery and Writeup Formatting Tips will tell you about the use of <code> tags (short form: <c>...</c>) around snippets of perl code and data -- which is sort of mandatory.
Yeah, loading a flat file with many rows seems to be a good way to use sqlldr. BUT in some cases we load a file with millions of rows, and that still takes sqlldr a moderate amount of time (~1-3 hours, depending on how much text we insert into CLOB fields). So we first have to spend the time writing out the flat sqlldr files, and then wait while we load them. If I compute each row to load and feed it to sqlldr over stdin as I go, that should save me an hour or two of loading time, because we wouldn't have to wait for the entire file to be built before loading it. I would like my application/script to work in this order:
1. get the file with text
2. compute the sqlldr values in a delimited string
3. load the values with sqlldr
Now, I would use an insert statement with DBI, but we have millions of rows to load at a single time, so insert statements would take days or weeks with this much data. I'm looking for a way to insert like that, only with sqlldr, and using STDIN seems like a viable option.
...in some cases we load a file with millions of rows and that still takes sqlldr a moderate amount of time ~ 1-3 hours depending on how much text we insert into clob fields.
The loading of that quantity into oracle is not going to take any less time by doing it any other way. I can appreciate a desire to speed things up, and it does seem likely that if you avoid writing that quantity of data to disk, but instead pipe it directly to sqlldr, you will save the amount of time that it takes to write to and read back from the disk.
In other words, by streaming data directly into sqlldr, neither the data creation nor the database loading go any faster -- you just eliminate the intermediate delay of waiting till all the data is on disk before loading it.
That said, you might want to use a "tee" in your pipeline: have the data go to a file while it's also going directly to sqlldr, so you have something to look at and work from if anything goes awry.
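For what it's worth, a minimal sketch of that "tee" idea in Perl might look like the following (illustrative only: the load_data.dat name and the next_row() helper are made up, and $sqlldr_fh stands for however you end up feeding sqlldr -- a named pipe or similar, as discussed elsewhere in this thread):
use strict;
use warnings;

# Write every generated row both to a plain disk file (a stable copy for
# diagnosis and error recovery) and to the stream that feeds sqlldr.
sub tee_row {
    my ($row, $sqlldr_fh, $copy_fh) = @_;
    print {$copy_fh}   $row or die "write to copy file failed: $!";
    print {$sqlldr_fh} $row or die "write to sqlldr failed: $!";
}

open my $copy_fh, '>', 'load_data.dat'
    or die "Can't open load_data.dat: $!";

# Hypothetical usage -- $sqlldr_fh and next_row() are stand-ins here:
# while ( defined( my $row = next_row() ) ) {
#     tee_row( "$row\n", $sqlldr_fh, $copy_fh );
# }
# close $copy_fh;
The disk copy costs almost nothing extra while the stream is being produced, and it gives you the stable reference mentioned above if anything goes awry.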
You are certainly right that using DBI "insert" statements would be orders of magnitude slower.
Re: using stdin with sqlldr
by cmdrake (Acolyte) on Nov 18, 2008 at 14:44 UTC
There doesn't appear to be any support in sqlldr for reading from stdin. However, you can set up a named pipe and use that as your data source for sqlldr.
But... why not just use DBI and INSERT directly from your perl?
because we typically insert about a million records per load. Doing this with an insert statement would take weeks or months.
I tried the "documented" ways to get STDIN working, but they failed. So I tried named pipes, and got it working on Windows. You could do something very similar on *nixy platforms with mknod/mkfifo (a rough sketch of that follows the Windows example below).
Control file:
LOAD DATA
INFILE '\\.\pipe\sql_pipe\'
REPLACE
INTO TABLE my_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(col1,col2,col3,col4)
Perl:
use strict;
use Win32::Pipe;

$| = 1;  # autoflush for messages

my $pipe_name = "sql_pipe";
my $pipe = new Win32::Pipe($pipe_name)
  || die "Can't Create Named Pipe: $!\n";

open(my $ldr, "|sqlldr.exe a/b\@c control=control.ctl log")
  || die "Unable to execute sqlldr: $!";

$pipe->Connect();

# ... do something interesting to get data, and probably loop
$pipe->Write(join(",", $1, $2, $3, $4) . "\n");
# ...

$pipe->Disconnect();
$pipe->Close();
close $ldr;
Note: the pipe in the open was just a leftover from my first attempt... You could remove that and do something more sensible for launching sqlldr...
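On the *nixy side mentioned above, a rough, untested sketch using POSIX::mkfifo might look like this (the FIFO path, credentials, control file name, and the get_next_record() helper are all placeholders, not anything from the original post):
use strict;
use warnings;
use POSIX qw(mkfifo);

my $fifo = '/tmp/sql_pipe';              # placeholder path
unlink $fifo;
mkfifo($fifo, 0600) or die "Can't create FIFO $fifo: $!";

# The control file would point at the FIFO as its data source, e.g.
#   INFILE '/tmp/sql_pipe'
# Start sqlldr in a child process so it can block reading the FIFO...
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    exec 'sqlldr', 'user/pass@db', 'control=control.ctl'
        or die "Unable to execute sqlldr: $!";
}

# ...then open the FIFO for writing and stream the rows into it.
open my $fh, '>', $fifo or die "Can't open $fifo for writing: $!";
# while (my @fields = get_next_record()) {
#     print {$fh} join(',', @fields), "\n";
# }
close $fh;
waitpid $pid, 0;
unlink $fifo;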