Process the files from other sources into a load format that suits the DBMS you choose. You might keep a directory for this named by the download date (most operations go for a daily upload from other sources rather than monthly, both to avoid the delay of processing a big backlog and to stay up to date).
These files should therefore be one file per table, with the PK columns first in their proper order and the rows sorted, ready for the next step:
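As a sketch of that preparation step (file name, delimiter and rows below are invented for illustration):

```shell
# Hypothetical raw extract: pipe-delimited, PK column first.
printf '1002|SMITH|LONDON\n1001|JONES|PARIS\n1003|BROWN|MADRID\n' > customer.raw

# Because the PK columns lead each row, a plain byte-wise sort orders
# the whole file by PK -- which comm(1) requires of both of its inputs.
LC_ALL=C sort customer.raw > customer.sorted
cat customer.sorted
# prints:
# 1001|JONES|PARIS
# 1002|SMITH|LONDON
# 1003|BROWN|MADRID
```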
To calculate an incremental load, use the Unix comm command with the previous day's file as the first argument and today's as the second: option -23 gives the deletes (rows only in the old file) and -13 the inserts (rows only in the new file); an update is then represented as a delete plus an insert. E.g. for the deletes:
To build the delete statements, you will need to query the data dictionary of the database to get each table's PK, which has to be matched with the columns retrieved from the above pipe.

    for ( glob "path/YYYYMMDD/*" ) {
        my @split = split '/', $_;
        my $file  = pop @split;
        # previous day's file first: comm -23 keeps the rows that were
        # in yesterday's load but are gone today, i.e. the deletes
        open my $comm, '-|', "comm -23 path/$prv_bus_day/$file $_"
            or die "cannot run comm: $!";
        for my $delrow ( <$comm> ) {
            # build the delete statement from $delrow
        }
        close $comm;
    }
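As a sanity check, the whole step can be tried by hand on two tiny sorted extracts. Everything below is invented for illustration (table CUSTOMER, single PK column CUST_ID, pipe-delimited rows), and a production loader would use bind variables rather than inlining literals into the SQL:

```shell
# Yesterday's and today's sorted extracts for one table.
printf '1001|JONES|PARIS\n1002|SMITH|LONDON\n1003|BROWN|MADRID\n' > prev.txt
printf '1001|JONES|PARIS\n1003|BROWN|OSLO\n1004|GREEN|ROME\n'     > today.txt

# Previous file first, today's second:
comm -13 prev.txt today.txt   # rows only in today's file  -> inserts
comm -23 prev.txt today.txt   # rows only in yesterday's   -> deletes
# (the changed BROWN row shows up in both lists: a delete plus an insert)

# Turn the deletes into DELETE statements keyed on the PK column(s).
comm -23 prev.txt today.txt |
awk -F'|' -v pks='CUST_ID' '{
    n = split(pks, names, ",")
    stmt = "DELETE FROM CUSTOMER WHERE "
    for (i = 1; i <= n; i++) {
        if (i > 1) stmt = stmt " AND "
        stmt = stmt names[i] " = \047" $i "\047"
    }
    print stmt ";"
}'
# prints:
# DELETE FROM CUSTOMER WHERE CUST_ID = '1002';
# DELETE FROM CUSTOMER WHERE CUST_ID = '1003';
```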
One world, one people
In reply to Re: RDB updates: rebuild from scratch vs. incremental
by anonymized user 468275
in thread RDB updates: rebuild from scratch vs. incremental
by tlm