I have a text file that looks like this:
record_id, datum_1, datum_2, ... , datum_30;
where record_id is integer, and 99% of datums are integers, although about 1% may be text or decimals. Datums can be null. There are 1.5-million records.
Then I have a collection of about 35,000 patterns I have to search for. That is, find all records that have, for example, datum_1 = x and datum_8 = y and datum_20 = z, regardless of what might be in other columns. A single record may contain several patterns, so each line has to be searched for each pattern
I realize this is just mimicing the functionality of a database, (select record_id from theTable where datum_1 = x and datum_8 = y and datum_20 = z) but I was wondering if there's a very efficient way of doing this directly on the file without setting up a database and without scanning 1.5-million lines 35,000 times (I wonder how long 50-billion line scans would take?). I've thought about this most of the day and come up with nothing promising....
In reply to Best way to find patterns in csv file? by punch_card_don
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |