After years of silent dedication, the young monk's patience pays off when his skill is put to the ultimate test.

I've been studying regular expressions quite a bit to try to get the hang of them, and I have finally gotten to the point where I am truly grokking them. However, despite getting some minor tweaks and efficiencies into my code, I have never really needed to write a truly complex regular expression. Until now. My training has paid off :)

I am routinely coming up against data like the following:

    Std_English,A2,B3|Std_Arts,A2,B6|Std_Cultural,A1,E8

Whenever I encounter a variable containing a value like "Cultural" or "English", I need to extract the A2,B6-type data from the corresponding section; sections are delimited by a | or by the end of the line. It's not terribly difficult, but the script in question was running very slowly, and the regex I was using sat inside a loop that was doing its part to slow things down. Thus, I was forced to write the most efficient regex I could for this. Here's the result:
    $input =~ /^
        (?:                     # Non-capturing parens
            [^_]                # All non-underscores
            |                   # or
            _                   # underscore
            (?!                 # not followed by
                ${value}        # the current standards
            )
        )+                      # 1 or more characters like above
        _${value},              # Current standard followed by comma
        (                       # Capture to $1
            (?:                 # Non-capturing parens
                [A-Z]\d{1,2},?  # One cap, one or two digits, and an optional comma
            )+                  # Above one or more times
        )                       # End capture
        .*                      # Rest of line (okay to be greedy here)
    /ix;
Wow! That was a mouthful. Just a couple of months ago, when I joined PerlMonks, I never would have dreamed of writing anything like that. Thanks to all of you for your helpfulness and patience.
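
For anyone who wants to try it against the sample data, here is a minimal sketch of how the regex might be wrapped up and called. The sub name extract_codes and the variable $record are made up for illustration, and the quotemeta call assumes that the value is literal text rather than a pattern; neither is part of my actual script.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): returns the codes that
# follow "_<value>," in a pipe-delimited record, or nothing if no section matches.
sub extract_codes {
    my ($input, $value) = @_;
    $value = quotemeta $value;    # assumption: the value is literal text

    if ( $input =~ /^
            (?:
                [^_]              # any non-underscore
                |                 # or
                _ (?! $value )    # an underscore not followed by the value
            )+
            _ $value ,            # the standard we want, then a comma
            (                     # capture the codes
                (?: [A-Z]\d{1,2},? )+
            )
            .*                    # rest of line
        /ix )
    {
        return $1;
    }
    return;
}

my $record = 'Std_English,A2,B3|Std_Arts,A2,B6|Std_Cultural,A1,E8';

for my $value (qw(English Arts Cultural)) {
    my $codes = extract_codes($record, $value);
    print "$value: ", (defined $codes ? $codes : 'no match'), "\n";
}
# Prints:
#   English: A2,B3
#   Arts: A2,B6
#   Cultural: A1,E8
```

One thing to watch with /x: variables are interpolated before the comments are stripped, so the comments in this sketch avoid dollar signs and slashes.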

Cheers,
Ovid

P.S.: To monks less familiar with regex, see my node Death to Dot Star! for an explanation of why the above is efficient.

RE: The training of a monk
by coreolyn (Parson) on Aug 12, 2000 at 19:13 UTC

    While the efficiency of the regex may be wonderful, I want to thank you for posting this node simply for the style in which you broke out the functionality of the expression. As I need to share a lot of code at work, I may adopt this style of expression building, as it provides clarity and insight into its functionality (and might alleviate the number of verbal explanations I need to provide).

    coreolyn Duct tape devotee.

RE: The training of a monk
by Buckaroo Buddha (Scribe) on Aug 14, 2000 at 18:24 UTC

    WOW! this is the best commenting of a regex i've seen
    never even thought of doing it down to that depth
    cool :) thanks