I second the idea of using a database (and DBD::SQLite, unless you already have a "real" RDBMS readily available). 1.5 million rows is a lot of data, after all. A database will pay off especially if you plan to run this query on a regular basis.

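Now, if you really want to work on the CSV, I personally would run it through grep first. Suppose you need datum1 = 99 and datum2 = 130 and datum4 = 5. You can cut down the size of the file by throwing out all lines that do not contain "99" and "130" and "5" somewhere, which is probably a lot. Note that grep matches substrings, so a line containing 199 or 1305 also gets through. That is fine for a prefilter, because it never drops a real match, but the exact per-column check still has to happen afterwards.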

grep 99 data.csv | grep 130 | grep 5 > filtered.csv

After this preprocessing, a CSV module from CPAN might be able to handle the rest.
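
For simple data, the exact check on the prefiltered file can also be done with awk. This is only a rough sketch, assuming plain comma-separated fields with no quoting, no header row, and datum1, datum2 and datum4 sitting in columns 1, 2 and 4. The column positions and the sample rows are made up for illustration:

```shell
# Hypothetical sample data: columns are datum1,datum2,datum3,datum4
cat > data.csv <<'EOF'
99,130,7,5
199,130,7,5
99,130,7,15
12,34,56,78
99,130,5,6
EOF

# Coarse substring prefilter, as in the post: keeps 4 of the 5 lines
grep 99 data.csv | grep 130 | grep 5 > filtered.csv

# Exact comparison on the assumed column positions (1, 2 and 4)
awk -F, '$1 == 99 && $2 == 130 && $4 == 5' filtered.csv > matches.csv

cat matches.csv
```

Only the first sample row survives both steps. If the real file has quoted fields or embedded commas, splitting on "," is not safe and a proper CSV parser is the better choice.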

Of course, this simple grep only works if you have AND queries (not OR). But you do, right?

Update: I just saw that you have 35000 patterns to check. Forget the CSV file and use a database (and index the columns that appear in most patterns).
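
To give an idea of what that could look like, here is a small sketch using the sqlite3 command-line shell (assumed to be installed). The table names, column names, index name and sample rows are all hypothetical, and it makes the simplifying assumption that every pattern tests the same three columns (datum1, datum2, datum4) and that the files are plain numeric CSV without quoting or a header row:

```shell
# Hypothetical sample data: columns are datum1,datum2,datum3,datum4
cat > data.csv <<'EOF'
99,130,7,5
199,130,7,5
99,130,7,15
12,34,56,78
EOF

# Hypothetical pattern list: one wanted (datum1,datum2,datum4) triple per line
cat > patterns.csv <<'EOF'
99,130,5
12,34,78
1,2,3
EOF

rm -f data.db
sqlite3 data.db <<'SQL' > hits.csv
CREATE TABLE data (datum1 INTEGER, datum2 INTEGER, datum3 INTEGER, datum4 INTEGER);
CREATE TABLE patterns (datum1 INTEGER, datum2 INTEGER, datum4 INTEGER);
.separator ,
.import data.csv data
.import patterns.csv patterns
-- Index the columns the patterns test, so each lookup avoids a full scan
CREATE INDEX data_pattern_idx ON data (datum1, datum2, datum4);
-- Check all patterns in one query instead of one pass per pattern
SELECT d.datum1, d.datum2, d.datum3, d.datum4
FROM patterns AS p
JOIN data AS d
  ON d.datum1 = p.datum1 AND d.datum2 = p.datum2 AND d.datum4 = p.datum4
ORDER BY d.datum1;
SQL

cat hits.csv
```

The same statements can be issued from Perl through DBI and DBD::SQLite. If the patterns test different column combinations, one table per combination (or NULL as a wildcard in the join condition) would be one way to extend this.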