Nevertheless, if the ASCII data MAY contain duplicates, you have to do something about it. Ny general approach in designing a script for this type of requirement is rather different:

1) keeps sorted backup of the flat input files used per run

2) use the unix comm utility (twice per file) to create a forward and backward difference extracts based on the latest ASCII file and the previous one

3) compare with the previous run and execute batched deletion SQL from the backward differences

4) and then for the insert case, take advantage of MySql's bulk loader, using Perl to translate the flat format only for the "insert" case.

This means that updates are translated into a delete plus an insert per case, but overall in this context that performs better anyway. Unchanged records in the flat file are filtered out by comm. This package of measures overall performs many times faster than the more "obvious" per line design and was proved to be the most reliable.

Update: this is assuming there are no manual changes made in the meantime, in which case you have to back-generate an updated version of the "previous" (now simulated) ASCII file from your own data rather than use the one from the previous run.

__________________________________________________________________________________

^M Free your mind!


In reply to Re: Perl DBI adding some, not all, records to MySQL database by Moron
in thread Perl DBI adding some, not all, records to MySQL database by ericlee

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.