The major pain with trying to select records using regexes is that you have to match the whole record instead of just the fields that you are selecting on, hence your difficulties with specifying the logical select "anything except this". The second problem is that your regex can match data in a part of the record other than the field that you are interested in.
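To make the second problem concrete, here is a minimal sketch. The space-separated record layout and the sample values are made up for illustration; they are not taken from your data.

```perl
use strict;
use warnings;

# Hypothetical record: src_ip dst_ip proto src_port
my $record = '10.0.0.1 10.0.0.80 tcp 1234';

# Intent: "source port is 80". An unanchored regex knows nothing about
# fields, so it happily matches the "80" inside the destination IP.
my $whole_record_hit = $record =~ /80/ ? 1 : 0;

# Field-by-field: split first, then test only the field of interest.
my ($src_ip, $dst_ip, $proto, $src_port) = split ' ', $record;
my $field_hit = $src_port =~ /^80$/ ? 1 : 0;

print "whole-record match: $whole_record_hit\n";
print "field match:        $field_hit\n";
```

The whole-record test reports a hit even though the source port is 1234; the per-field test does not.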
By imposing some structure on your data (i.e. making the fields in the record fixed length) and matching or rejecting on a field-by-field basis rather than trying to match (or not) a whole record at a time, you greatly simplify the process. This is what you would get by moving your data into a flat file DB and using DBI to perform your queries.
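One way to get fixed-length fields is pack and unpack with the space-padded 'A' template. This is only a sketch: the widths (10, 10, 4, 6) and the helper names make_record and split_record are assumptions for illustration, and a full dotted-quad address would need up to 15 characters.

```perl
use strict;
use warnings;

# Assumed layout: src_ip(10) dst_ip(10) proto(4) src_port(6) = 30 bytes.
my $layout = 'A10 A10 A4 A6';

# 'A' pads each field with spaces on pack and strips them on unpack.
sub make_record  { return pack $layout, @_ }
sub split_record { return unpack $layout, $_[0] }

my $record = make_record('10.0.0.1', '10.0.0.80', 'tcp', '1234');
my ($src_ip, $dst_ip, $proto, $src_port) = split_record($record);

printf "length=%d src=%s dst=%s proto=%s port=%s\n",
    length($record), $src_ip, $dst_ip, $proto, $src_port;
```

Every record is then exactly 30 bytes, so each field always sits at the same offset.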
At the very least, you should consider fixing the length of the fields of your records. You could then use substr to pick out each field and apply a regex to just that field, which greatly simplifies your queries. E.g.:
    if (    substr($record,  0, 10) =~ $src_ip_of_interest
        and substr($record, 10, 10) =~ $dst_ip_of_interest
        and substr($record, 20,  4) =~ $proto_of_interest
        and substr($record, 24,  6) !~ $src_port_of_disinterest
        # etc ...
    ) {
        # we found a record that matches the query
    }
I think that you can see how much this simplifies the regexes involved. Generating conditionals using this form and using eval to execute them would be much simpler than trying to come up with a generic regex generator.
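A rough sketch of that generate-and-eval idea follows. The %field table, the make_filter name, and the [field, operator, pattern] query format are all invented for this example; the offsets follow the substr calls above.

```perl
use strict;
use warnings;

# Hypothetical field table: name => [offset, length].
my %field = (
    src_ip   => [  0, 10 ],
    dst_ip   => [ 10, 10 ],
    proto    => [ 20,  4 ],
    src_port => [ 24,  6 ],
);

# Build a predicate from [field, operator, pattern] triples.
sub make_filter {
    my @tests = @_;
    my (@clauses, @patterns);
    for my $t (@tests) {
        my ($name, $op, $re) = @$t;
        my $f = $field{$name} or die "unknown field '$name'";
        die "bad operator '$op'" unless $op eq '=~' or $op eq '!~';
        push @patterns, qr/$re/;    # compile each pattern once
        push @clauses, sprintf 'substr($_[0], %d, %d) %s $patterns[%d]',
            @$f, $op, $#patterns;
    }
    push @clauses, '1' unless @clauses;    # an empty query accepts everything
    # The generated sub closes over @patterns, so only field offsets and
    # operators (both validated above) end up in the eval'd string.
    my $code   = 'sub { ' . join(' and ', @clauses) . ' }';
    my $filter = eval $code or die "could not compile filter: $@";
    return $filter;
}

my $record = pack 'A10 A10 A4 A6', '10.0.0.1', '10.0.0.80', 'tcp', '1234';
my $filter = make_filter(
    [ src_ip   => '=~', '^10\.0\.0\.1\s*$' ],
    [ proto    => '=~', '^tcp' ],
    [ src_port => '!~', '^80\s*$' ],
);
print $filter->($record) ? "match\n" : "no match\n";
```

Since the fields are space padded, patterns that anchor at the end need to allow trailing whitespace, hence the \s*$.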
That said, using BerkeleyDB or similar in conjunction with DBI and a suitable DBD::* driver would be considerably easier to code and probably much faster.
In reply to "Re: I agree, but..." by BrowserUk, in thread "Runtime Regexp Generation" by tekkie