I really like hv's suggestion, but I don't think he took it far enough, especially since you have additional cases of (?:.*\n)*? later on in that monster.

Let's see if I understand the situation here... You have a multi-line string stored as a scalar, and you are trying to capture 11 substrings. Each substring to be captured is one whole line of text. Apart from a couple of minor variations, these lines all start with the same initial string and differ only in the two-digit number that follows it.

I don't know what you need to do with all those captures once you get the match, but I expect it would not be hard to tweak the subsequent code just a little so that the overall process is a lot more compact and sensible (and doesn't crash).

But then again, since you say you are "getting the regex from a database" and "doing more regexes against more files", the overall system must be a lot more complicated than I would expect, and maybe a simpler strategy would involve a rather large amount of refactoring. (But maybe that wouldn't be such a bad thing?)

Still, I wonder if something like this might be a step in the right direction:

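# Assumes the multi-line string is in $_ (split with no EXPR splits $_).
# Keeps each whole line whose two-digit field is 10-17, 21, 25 or 28.
my @captures =
    grep { /^\t\Q.1.3.6.1.4.1.9.2.2.1.1.\E(\d{2}).*?\([A-Z]+\)$/
          and $1 =~ /1[0-7]|2[158]/ } split /\n/;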
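In case it helps to try this out, here is a self-contained sketch of the same idea. The sample lines and the $text variable are made up for illustration (I am only guessing at what your data looks like from the pattern). I also anchored the second test, which should not change what it accepts, since $1 is always exactly two digits.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real data layout is an assumption.
my $text = join "\n",
    "\t.1.3.6.1.4.1.9.2.2.1.1.10.1 = some value (INTEGER)",
    "\t.1.3.6.1.4.1.9.2.2.1.1.18.1 = some value (INTEGER)",
    "\t.1.3.6.1.4.1.9.2.2.1.1.21.1 = other value (STRING)",
    "an unrelated line",
    "\t.1.3.6.1.4.1.9.2.2.1.1.28.1 = last value (GAUGE)";

# Split into lines, then keep each whole line that has the fixed prefix
# and a two-digit field in the wanted set (10-17, 21, 25, 28).
my @captures = grep {
    /^\t\Q.1.3.6.1.4.1.9.2.2.1.1.\E(\d{2}).*?\([A-Z]+\)$/
        and $1 =~ /^(?:1[0-7]|2[158])$/
} split /\n/, $text;

printf "%d lines kept\n", scalar @captures;
print "$_\n" for @captures;
```

With this sample, the line numbered 18 and the unrelated line are dropped, so three lines are kept.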

Whether that would work for you depends on how important it is for the regex match to be as extensive and explicit as you seem to want it to be. In other words, is your regex so long and cumbersome because there's a chance that some of these single-line patterns might occur at various points (out of sequence) throughout the input file, and you must be sure to match the full sequence in proper order?

Or is it long and cumbersome because you just happen to be including all the details that you know about, even when they are redundant and/or unnecessary in terms of assuring correct matches?
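If the order really does matter, that need not force everything back into one giant regex. Here is a rough sketch, assuming the expected sequence is 10 through 17, then 21, 25, 28. That sequence is my guess from the character classes above, not something I know about your data. The in_expected_order sub and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Assumed expected sequence, inferred from 1[0-7] and 2[158].
my @expected = (10 .. 17, 21, 25, 28);

# Hypothetical helper: true if the two-digit fields of the matching
# lines occur exactly in the expected order, with none missing or extra.
sub in_expected_order {
    my ($text) = @_;
    my @seen;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^\t\Q.1.3.6.1.4.1.9.2.2.1.1.\E(\d{2}).*?\([A-Z]+\)$/;
        my $num = $1;    # copy before the next match resets $1
        push @seen, $num if $num =~ /^(?:1[0-7]|2[158])$/;
    }
    return "@seen" eq "@expected";
}

# Made-up sample data: one matching line per expected number.
my @lines = map { sprintf "\t.1.3.6.1.4.1.9.2.2.1.1.%d.1 = x (INTEGER)", $_ } @expected;
my $good  = join "\n", @lines;
my $bad   = join "\n", reverse @lines;

printf "%s\n", in_expected_order($good) ? "in order" : "out of order";
printf "%s\n", in_expected_order($bad)  ? "in order" : "out of order";
```

The order check is then a plain comparison of two short lists, which keeps the regex itself small.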

(update: removed spurious close paren from code snippet)

In reply to Re: regex causing segmentation fault (core dump) by graff
in thread regex causing segmentation fault (core dump) by Otogi
