Assume I have a large collection of strings (let's say a million of them), each associated with a timestamp.
I now want to be able to query this collection for all strings matching a given regex, possibly constrained by upper and/or lower limits on the associated timestamp. For example, one query would be "find all strings matching /abc.*/", and another would be "find all strings matching /x*y/ whose associated timestamps fall within the last week".
Evidently I could put all the data into a database and use SQL for the queries, but I wonder if there is a good algorithm to build a suitable index for such queries and do all the querying in pure Perl, in such a way, of course, that answering a query does not take more than a few seconds.
If building an index that supports arbitrary regexes is too difficult, I could make do with shell-style globbing.
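To make the problem concrete, here is a minimal sketch of the naive baseline that any real index would have to beat. It is only an illustration under assumptions that are not part of the problem statement: the records are held in memory as `[ $epoch, $string ]` pairs sorted by timestamp, timestamps are integer epoch seconds, and the names `@records`, `lower_bound` and `query` are made up for this example. A binary search narrows the timestamp range, and the regex is then applied linearly to whatever remains.

```perl
use strict;
use warnings;

# Hypothetical layout: records sorted by timestamp, each [ $epoch, $string ].
my @records = sort { $a->[0] <=> $b->[0] } (
    [ 1_700_000_000, 'abcdef' ],
    [ 1_700_000_500, 'xxy'    ],
    [ 1_700_001_000, 'abc'    ],
    [ 1_700_002_000, 'y'      ],
);

# Index of the first record whose timestamp is >= $ts (binary search).
sub lower_bound {
    my ($recs, $ts) = @_;
    my ($lo, $hi) = (0, scalar @$recs);
    while ($lo < $hi) {
        my $mid = int( ($lo + $hi) / 2 );
        if ($recs->[$mid][0] < $ts) { $lo = $mid + 1 } else { $hi = $mid }
    }
    return $lo;
}

# Return all strings matching $re whose timestamp lies in [$from, $to].
# Either bound may be undef, meaning "unbounded" on that side.
# Assumes integer timestamps, so "$to + 1" marks the exclusive upper end.
sub query {
    my ($recs, $re, $from, $to) = @_;
    my $start = defined $from ? lower_bound($recs, $from)   : 0;
    my $end   = defined $to   ? lower_bound($recs, $to + 1) : scalar @$recs;
    return map  { $_->[1] }
           grep { $_->[1] =~ $re }
           @{$recs}[ $start .. $end - 1 ];
}
```

With this, `query(\@records, qr{^abc})` scans the whole collection, while `query(\@records, qr{^x*y$}, 1_700_000_500, 1_700_001_000)` only scans the records inside the time window. The timestamp part is cheap; the regex part is still a linear scan, which is exactly the piece I would like an index for.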
Any ideas?
In reply to Creating an index on a string-collection by morgon