in reply to Re^2: join on 6 huge files
in thread join on 6 huge files

The value of "how often is this going to run" cannot be overstated. Along the same lines, how quick does it need to be? If there is a need for speed, I'd start whipping out an RDBMS.

Sorry, but just how would "whipping out an RDBMS" speed up the merging of six flat data files?

Once they are merged, they're merged. There is no point in repeating the process.

If, however, new flatfiles are received on a regular basis and require merging, how would an RDBMS help?

Even if the process(es) producing the flatfiles could be persuaded to write the data directly to an RDBMS, just querying the 9,000,000 records from the RDBMS would take longer than merging the flatfiles. Considerably longer.

And that is without taking into consideration the time taken to insert the data in the first place. Never mind the cost of amending the (potentially 6) applications to write directly to the RDBMS--if that is even possible.
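
For a sense of what "merging the flatfiles" can look like, here is a minimal sketch in Python. It is not the code from this thread. It assumes things the thread does not state: each file is already sorted by its key as a string, the key is the first tab-separated field, and every file fits the same layout. The name merge_on_key is made up for illustration. Each file is read once, front to back, and only one line per file is held in memory at a time.

```python
import heapq
from itertools import groupby


def merge_on_key(streams, sep="\t"):
    """Yield one joined line per key from several key-sorted line iterables.

    Assumption: every stream is sorted by its first field, compared as a string.
    """
    def keyed(index, lines):
        # Split each line into (key, stream index, remaining fields).
        for line in lines:
            key, _, rest = line.rstrip("\n").partition(sep)
            yield key, index, rest

    # heapq.merge lazily interleaves the sorted streams into one sorted stream.
    merged = heapq.merge(*(keyed(i, s) for i, s in enumerate(streams)))

    # Consecutive records with the same key are joined onto one output line.
    for key, group in groupby(merged, key=lambda rec: rec[0]):
        yield sep.join([key] + [rest for _, _, rest in group])


if __name__ == "__main__":
    a = ["1\ta\n", "2\tb\n"]
    b = ["1\tx\n", "3\tz\n"]
    for line in merge_on_key([a, b]):
        print(line)
```

Open file handles can be passed in place of the lists. A key missing from some file simply produces a shorter line here, so anything needing fixed columns or a strict inner join would have to add that check.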


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Re^4: join on 6 huge files
by thor (Priest) on Jun 10, 2004 at 20:07 UTC
    how would "whipping out an RDBMS" speed up the merging of six flat data files?
    I wasn't clear in the original post. I apologize. If the OP is going to do the merge over the same data many times and speed is an issue, then I would suggest a database. It was clear in my head, honest. :)

    thor