Process the files from other sources into a load format that suits the DBMS you choose. You might keep a directory for this named by the download date (most operations go for a daily upload from other sources rather than monthly, both to avoid the delay of processing a big backlog and to stay up to date).
These files should therefore be one file per table, with the PK columns first in their proper order and the rows sorted, ready for the next step:
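As a sketch of that preparation step (file name, delimiter and rows below are invented for illustration):

```shell
# Hypothetical raw extract: pipe-delimited, PK column first.
printf '1002|SMITH|LONDON\n1001|JONES|PARIS\n1003|BROWN|MADRID\n' > customer.raw

# Because the PK columns lead each row, a plain byte-wise sort orders
# the whole file by PK -- which comm(1) requires of both of its inputs.
LC_ALL=C sort customer.raw > customer.sorted
cat customer.sorted
# prints:
# 1001|JONES|PARIS
# 1002|SMITH|LONDON
# 1003|BROWN|MADRID
```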
To calculate an incremental load, use the Unix comm command with the previous day's file as the first argument and today's as the second: option -23 gives the deletes (rows only in the old file) and -13 the inserts (rows only in the new file); an update is then represented as a delete plus an insert. E.g. for the deletes:
To build the delete statements, you will need to query the data dictionary of the database to get each table's PK, which has to be matched with the columns retrieved from the above pipe.

    for ( glob "path/YYYYMMDD/*" ) {
        my @split = split '/', $_;
        my $file  = pop @split;
        # previous day's file first: comm -23 keeps the rows that were
        # in yesterday's load but are gone today, i.e. the deletes
        open my $comm, '-|', "comm -23 path/$prv_bus_day/$file $_"
            or die "cannot run comm: $!";
        for my $delrow ( <$comm> ) {
            # build the delete statement from $delrow
        }
        close $comm;
    }
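As a sanity check, the whole step can be tried by hand on two tiny sorted extracts. Everything below is invented for illustration (table CUSTOMER, single PK column CUST_ID, pipe-delimited rows), and a production loader would use bind variables rather than inlining literals into the SQL:

```shell
# Yesterday's and today's sorted extracts for one table.
printf '1001|JONES|PARIS\n1002|SMITH|LONDON\n1003|BROWN|MADRID\n' > prev.txt
printf '1001|JONES|PARIS\n1003|BROWN|OSLO\n1004|GREEN|ROME\n'     > today.txt

# Previous file first, today's second:
comm -13 prev.txt today.txt   # rows only in today's file  -> inserts
comm -23 prev.txt today.txt   # rows only in yesterday's   -> deletes
# (the changed BROWN row shows up in both lists: a delete plus an insert)

# Turn the deletes into DELETE statements keyed on the PK column(s).
comm -23 prev.txt today.txt |
awk -F'|' -v pks='CUST_ID' '{
    n = split(pks, names, ",")
    stmt = "DELETE FROM CUSTOMER WHERE "
    for (i = 1; i <= n; i++) {
        if (i > 1) stmt = stmt " AND "
        stmt = stmt names[i] " = \047" $i "\047"
    }
    print stmt ";"
}'
# prints:
# DELETE FROM CUSTOMER WHERE CUST_ID = '1002';
# DELETE FROM CUSTOMER WHERE CUST_ID = '1003';
```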
One world, one people
In reply to Re: RDB updates: rebuild from scratch vs. incremental
by anonymized user 468275
in thread RDB updates: rebuild from scratch vs. incremental
by tlm