Trying to do database operations on that scale, over any sort of WAN link (slow or otherwise) is a tad risky and sub-optimal, I think. Can't you find a way to (s)ftp the TSV file(s) to the database host, then use the database server's native bulk-loader tool on that host to finish the job?
(update: for that matter, it might be faster/easier to put the 50 GB on a USB or firewire disk and fedex it to the remote host... )
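If you really do need to do all those row inserts over the WAN link, and if you are confident about knowing the difference between success and failure for each insert, then my first notion would be to have the insertion script read the TSV rows on stdin and write one line to a log file for each row it has processed.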
If there's an interruption in the process, just count the lines in the log file (let's say this is "N"), and restart the insertion program at line N+1 of the TSV file (the first N rows are already done), like this:
tail -n +$((N+1)) TSV.file | insertion_script >> log.file
You can repeat this as many times as necessary, using the appropriate value for N each time (however many lines are in the log at present), until the log file has the same line count as the TSV file.
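The restart step can be wrapped up so that N is read from the log instead of typed by hand. This is only a rough sketch: the function name resume_from is made up for illustration, and it assumes the insertion script appends exactly one complete line to the log per input row, right after that row is handled.

```shell
#!/bin/sh
# resume_from (hypothetical helper): print the rows of a TSV file that
# have no corresponding line in the log yet.
# Assumption: one complete log line per TSV row already processed.
resume_from() {
    tsv=$1
    log=$2
    done_rows=0
    # wc -l counts newlines, so a partially written last log line is not
    # counted and that row will be attempted again.
    if [ -f "$log" ]; then
        done_rows=$(wc -l < "$log" | tr -d ' ')
    fi
    # "tail -n +K" starts output at line K, so skip the first done_rows lines.
    tail -n +"$((done_rows + 1))" "$tsv"
}
```

Usage would then be something like `resume_from TSV.file log.file | insertion_script >> log.file`, repeated after each interruption. If a row could be inserted but not logged before a crash, that row would be sent twice, so the insert itself should tolerate a repeat.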
(In case you don't know "tail", it's a basic unix util, which could easily be emulated in perl as follows:
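#!/usr/bin/perl
use strict;
use warnings;

# Emulate "tail +N": print input starting at line N (default: line 1).
my $bgn = 1;
if ( @ARGV and $ARGV[0] =~ /^\+(\d+)$/ ) {
    $bgn = $1;
    shift;
}
while (<>) {
    print if $. >= $bgn;
}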
this just handles the "+N" usage of tail, which is all you need here; there's probably a one-liner form to do the same thing...)