comment on

Monadic Monks,

I have a text file that looks like this:

record_id, datum_1, datum_2, ... , datum_30;

where record_id is integer, and 99% of datums are integers, although about 1% may be text or decimals. Datums can be null. There are 1.5-million records.

Then I have a collection of about 35,000 patterns I have to search for. That is, find all records that have, for example, datum_1 = x and datum_8 = y and datum_20 = z, regardless of what might be in other columns. A single record may contain several patterns, so each line has to be searched for each pattern

I realize this is just mimicing the functionality of a database, (select record_id from theTable where datum_1 = x and datum_8 = y and datum_20 = z) but I was wondering if there's a very efficient way of doing this directly on the file without setting up a database and without scanning 1.5-million lines 35,000 times (I wonder how long 50-billion line scans would take?). I've thought about this most of the day and come up with nothing promising....

In reply to Best way to find patterns in csv file? by punch_card_don

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.