To start, I would concatenate all the CSV files into one (this only works if they all have the same fields in the same order). Then, with the help of a module like Text::CSV_XS, extract the value of the key field from every record in that combined file and store the values as the keys of a hash, so that each key is kept only once.
Next, open the big CSV file again with DBD::CSV and run an SQL statement such as "SELECT * FROM big.csv WHERE keyfield = key_value" once for each key in your hash, substituting that key for key_value. That way you extract the records one by one according to the value of their key.
As soon as you extract a record, write it to another file on disk with Text::CSV_XS.
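The steps above could be sketched roughly as follows. This is only an untested-in-production illustration, assuming the concatenated file has a single header line and lives in one directory. The sub name extract_by_key and its arguments (directory, table name, key column name, output file) are made up for the example, and the table and column names are interpolated into the SQL, so they must be trusted values.

```perl
use strict;
use warnings;
use DBI;
use Text::CSV_XS;

# Hypothetical helper: $table is the file name without ".csv",
# $key_col is the header name of the key field.
sub extract_by_key {
    my ($dir, $table, $key_col, $out_file) = @_;
    my $reader = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
    my $writer = Text::CSV_XS->new({ binary => 1, auto_diag => 1, eol => "\n" });

    # Pass 1: collect the unique key values in a hash.
    open my $in, '<', "$dir/$table.csv" or die "$dir/$table.csv: $!";
    my $header = $reader->getline($in) or die "no header line in $table.csv";
    $reader->column_names(@$header);
    my %seen;
    while (my $rec = $reader->getline_hr($in)) {
        $seen{ $rec->{$key_col} } = 1;
    }
    close $in;

    # Pass 2: query the same file through DBD::CSV, one key at a time.
    my $dbh = DBI->connect('dbi:CSV:', undef, undef,
        { f_dir => $dir, f_ext => '.csv/r', RaiseError => 1, PrintError => 0 });
    my $sth = $dbh->prepare("SELECT * FROM $table WHERE $key_col = ?");

    open my $out, '>', $out_file or die "$out_file: $!";
    $writer->print($out, $header);
    for my $key (sort keys %seen) {
        $sth->execute($key);
        while (my $row = $sth->fetchrow_arrayref) {
            $writer->print($out, $row);    # write each record as it is fetched
        }
    }
    close $out or die "$out_file: $!";
    $dbh->disconnect;
    return scalar keys %seen;              # number of distinct keys
}
```

Called as extract_by_key('.', 'big', 'keyfield', 'result.csv'), this would read ./big.csv. Using a placeholder (?) instead of pasting the key into the SQL string avoids quoting problems with keys that contain quotes or commas.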
If your CSV files are not too big, you could also read every record of every file with Text::CSV_XS, build a hash of arrays (HoA) keyed by the value of your key field, and then empty the HoA into a final CSV file. This is probably faster but less intuitive.
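A minimal sketch of the hash-of-arrays variant might look like this. It assumes every input file starts with the same header line and that everything fits in memory. The sub name merge_by_key and the 0-based key column index are inventions for the example, not something from the original question.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical helper: read all records from the files in @$files, group
# them by the value in column $key_idx (0-based), write them to $out_file.
sub merge_by_key {
    my ($files, $key_idx, $out_file) = @_;
    my $reader = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
    my $writer = Text::CSV_XS->new({ binary => 1, auto_diag => 1, eol => "\n" });
    my (%records, @header);

    for my $file (@$files) {
        open my $in, '<', $file or die "$file: $!";
        my $head = $reader->getline($in) or next;   # skip empty files
        @header = @$head unless @header;            # keep the first header
        while (my $row = $reader->getline($in)) {
            # Hash of arrays: one list of records per key value.
            push @{ $records{ $row->[$key_idx] } }, $row;
        }
        close $in;
    }

    # Empty the HoA into the final file, one key at a time.
    open my $out, '>', $out_file or die "$out_file: $!";
    $writer->print($out, \@header);
    for my $key (sort keys %records) {
        $writer->print($out, $_) for @{ $records{$key} };
    }
    close $out or die "$out_file: $!";
    return scalar keys %records;                    # number of distinct keys
}
```

For example, merge_by_key([ 'a.csv', 'b.csv' ], 0, 'merged.csv') would group on the first column. The keys are sorted as strings here; use a numeric sort if your key field is a number.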
CountZero
"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law