On reflection, there is a way to do this kind of parsing with regexes. I think of it as the "inch along with negative lookahead" strategy described (sort of) by Merlyn at Death to Dot Star. Something like this does what you described above, I believe.

use strict;
use warnings;

my $content = "";
while (<DATA>) {
    $content = $content . $_;
}

#print "content: $content"; # sanity check

while ($content =~ m/(
                    CALCON\([^)]*?\)[\r\n]*\{[^}]*?\}            # entire match. Same as in negative lookahead on next line.
                    ((?!CALCON\([^)]*?\)[\r\n]*\{[^}]*?\}).)*    # inch along with negative lookahead
                    )/xsmg) {
    my $entire_match = $1;
    if ($entire_match =~ /CALCON\((.*?)\)/) {
        my $test_number = $1;
        print "entire match: $entire_match\n";
        print "test number: $test_number\n";
        print "\n\n";
    }
    
    
}
__DATA__

CALCON(test1)
{
  TYPE(U8)
  FEATURE(DCOM)
  NAM(stmin)
  LABEL(Min seperation time between CFs)
  MIN(0)
  MAX(127)
  UNITS(ms)  
}

CALCON(test2)
{
  TYPE(U16)
  FEATURE(DCOM)
  NAM(dcomc_sestmr_timeout)
  LABEL(DCOM Session Timer Timeout)
  MIN(0)
  MAX(65535)
  UNITS(ms)
}

CALCON(test3)
{
  TYPE(U16)
  FEATURE(CALCON)
  NAM(dcomc_sestmr_timeout)
  LABEL(DCOM Session Timer Timeout)
  MIN(CALCON)
  MAX(65535)
  UNITS(ms)
}

This may be a case of killing a mosquito with a flamethrower, but... well... TIMTOWTDI. Maybe you like it :)

But seriously, an internal rule of thumb for me is that when I start having to inch along, it may be time to stop thinking regexes and start thinking something else.
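For comparison, here is a rough sketch of what "something else" could look like without reaching for a parser module: a plain line-by-line state machine. This is not Parse::RecDescent, and the sub name parse_calcon_blocks and the record layout are made up for illustration. It assumes the header, the opening brace, and the closing brace each sit on their own line, as in the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the text line by line and collect one
# record per CALCON block as { name => ..., fields => { KEY => value } }.
sub parse_calcon_blocks {
    my ($text) = @_;
    my (@records, $current, $in_body);
    for my $line (split /\r?\n/, $text) {
        if (!$in_body && $line =~ /^\s*CALCON\(([^)]*)\)\s*$/) {
            $current = { name => $1, fields => {} };   # header starts a new record
        }
        elsif ($current && !$in_body && $line =~ /^\s*\{\s*$/) {
            $in_body = 1;                               # opening brace on its own line
        }
        elsif ($in_body && $line =~ /^\s*\}\s*$/) {
            push @records, $current;                    # closing brace ends the record
            ($current, $in_body) = (undef, 0);
        }
        elsif ($in_body && $line =~ /^\s*(\w+)\((.*)\)\s*$/) {
            $current->{fields}{$1} = $2;                # KEY(value) line inside the body
        }
    }
    return @records;
}

my $sample = <<'END';
CALCON(test3)
{
  TYPE(U16)
  FEATURE(CALCON)
  MIN(CALCON)
}
END

for my $rec (parse_calcon_blocks($sample)) {
    print "test number: $rec->{name}\n";
    print "  $_ = $rec->{fields}{$_}\n" for sort keys %{ $rec->{fields} };
}
```

Because the header pattern is only tried outside a body, a "CALCON" inside the data area is treated as an ordinary value and needs no lookahead at all.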

Disclaimer: this works for your input data, but it makes me a little uneasy. Are there maybe edge cases I haven't thought of? That's why the gut still says, uh oh, reach for P::RD (Parse::RecDescent).
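One edge case of that sort, sketched on made-up input: if a header line has trailing whitespace before the newline, the [\r\n]* between the header and the brace no longer matches, so that record is silently folded into the previous one. The variable and sub names here are only for illustration, and the \s* variant is one possible relaxation, not a claim that it covers everything.

```perl
use strict;
use warnings;

# Hypothetical input: the second header has a trailing space before the newline.
my $content = "CALCON(a)\n{\n  MIN(0)\n}\n\nCALCON(b) \n{\n  MIN(1)\n}\n";

# Block pattern as used in the post, and a variant that allows any
# whitespace between the header and the opening brace.
my $strict_block  = qr/CALCON\([^)]*?\)[\r\n]*\{[^}]*?\}/;
my $relaxed_block = qr/CALCON\([^)]*?\)\s*\{[^}]*?\}/;

# Count how many records the inch-along pattern finds for a given block pattern.
sub count_records {
    my ($text, $block) = @_;
    my $count = () = $text =~ /$block(?:(?!$block).)*/sg;
    return $count;
}

printf "strict:  %d record(s)\n", count_records($content, $strict_block);
printf "relaxed: %d record(s)\n", count_records($content, $relaxed_block);
```

On this input the strict pattern reports one record and the relaxed one reports two, which is exactly the kind of quiet failure that makes me uneasy.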

UPDATE: Replaced the $& with $1 per holli below.

UPDATE 2: Made the "inch ahead" more thorough, so it doesn't fail on "CALCON" in the data area, as in the third test case. Originally this was just

$content =~m/(CALCON((?!CALCON).)* )/xsmg
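To see the difference between the two versions, here is a small side-by-side sketch on a single record shaped like the third test case. The variable names are mine, and I've used non-capturing groups so that list context returns only the whole chunks.

```perl
use strict;
use warnings;

# Hypothetical input: one record whose body mentions CALCON twice.
my $content = "CALCON(test3)\n{\n  FEATURE(CALCON)\n  MIN(CALCON)\n}\n";

# Original version: stop inching at any occurrence of the bare word CALCON.
my @simple = $content =~ m/(CALCON(?:(?!CALCON).)*)/sg;

# Updated version: stop inching only at a complete header-plus-braces block.
my $block    = qr/CALCON\([^)]*?\)[\r\n]*\{[^}]*?\}/;
my @thorough = $content =~ m/($block(?:(?!$block).)*)/sg;

printf "simple: %d chunk(s), thorough: %d chunk(s)\n",
    scalar @simple, scalar @thorough;
```

The simple version cuts the record into three chunks, one per occurrence of the word, while the thorough version returns it as a single chunk.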

In reply to Re: nested reg ex over multiple lines by tphyahoo
in thread nested reg ex over multiple lines by eg8rds
