in reply to Re^2: transposing and matching large datasets
in thread transposing and matching large datasets

because this kind of problem is what databases are good at.

If the data is already in the database, maybe. But only maybe: it depends on whether it is indexed correctly for this particular operation.

If the data is not in a database, then in the time you would spend writing the script to load it into one, the whole operation could already have been completed using simple flat-file operations. And that's before you actually load the data, index it, perform the join, and export all the data back to a flat file.

If there is an ongoing need for relational operations upon the dataset, then the cost of importing it may be amortised over the long term. But importing the data into a DB just to join it and export it all again is complete nonsense.
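
To make "simple flat-file operations" concrete, here is a minimal sketch of a one-pass hash join in Python. It is not code from this thread. It assumes two tab-delimited files that share a key in the first column, and that the smaller of the two fits in memory. The function name `hash_join`, its parameters, and the file layout are all hypothetical, chosen purely for illustration.

```python
import csv


def hash_join(small_path, large_path, out_path, key_col=0, delimiter="\t"):
    """Join two delimited flat files on one column.

    Assumption: the file at small_path fits in memory; the file at
    large_path is streamed one row at a time and never held whole.
    Returns the number of joined rows written to out_path.
    """
    # Index the smaller file: key -> list of rows carrying that key.
    index = {}
    with open(small_path, newline="") as small:
        for row in csv.reader(small, delimiter=delimiter):
            if row:
                index.setdefault(row[key_col], []).append(row)

    matched = 0
    with open(large_path, newline="") as large, \
            open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
        for row in csv.reader(large, delimiter=delimiter):
            if not row:
                continue
            # Rows whose key is absent from the index are skipped (inner join).
            for other in index.get(row[key_col], ()):
                # Large-file row first, then the small-file row minus its key.
                writer.writerow(row + other[:key_col] + other[key_col + 1:])
                matched += 1
    return matched
```

Called as `hash_join("small.tsv", "large.tsv", "joined.tsv")`, this reads each file exactly once and writes the result straight back to a flat file, with no load, index or export step. If neither file fits in memory, sorting both on the key and merging them would be the usual alternative.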


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."