Grrr! I wish they wouldn't do that.

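Anticipating more upping of the ante, with deeply nested brace/bracket combos, and wanting to capture an empty '{}' or '[]' only when it's nested inside another pair (e.g. '{ {} }') and not when it stands alone, here's (maybe) a bit of a cheat: match every balanced group, then throw away any match that's only two characters long.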

#!/usr/bin/env perl -l

use strict;
use warnings;

# Declared before assignment so each pattern can refer to itself.
my ($brace_re, $bracket_re);

# A balanced {...} group: runs of non-brace characters or a nested group.
$brace_re = qr< { (?: [^{}]++ | (??{ $brace_re }) )* } >x;

# A balanced [...] group, built the same way.
$bracket_re = qr< \[ (?: [^\[\]]++ | (??{ $bracket_re }) )* \] >x;

my $re = qr< ( $brace_re | $bracket_re ) >x;

while (<DATA>) {
    print;
    while (/$re/g) {
        # The cheat: a bare '{}' or '[]' is only two characters long.
        print "MATCH = $1" if length $1 > 2;
    }
    print '-' x 60;
}

__DATA__
...?[](...$[] = [ USER_ENTITY_NAME ], text${} = { this is a test })...
a[] = [ this is a [ test ] { test2 } ]
a{} = { this is a { test } [ test2 ] }
{ a { b [ {}c{} ] d } e } = [ f [ g { []h[] } i ] j ]
{}[]{ {}[] }[]{} - []{}[ []{} ]{}[]

Output:

...?[](...$[] = [ USER_ENTITY_NAME ], text${} = { this is a test })...
MATCH = [ USER_ENTITY_NAME ]
MATCH = { this is a test }
------------------------------------------------------------
a[] = [ this is a [ test ] { test2 } ]
MATCH = [ this is a [ test ] { test2 } ]
------------------------------------------------------------
a{} = { this is a { test } [ test2 ] }
MATCH = { this is a { test } [ test2 ] }
------------------------------------------------------------
{ a { b [ {}c{} ] d } e } = [ f [ g { []h[] } i ] j ]
MATCH = { a { b [ {}c{} ] d } e }
MATCH = [ f [ g { []h[] } i ] j ]
------------------------------------------------------------
{}[]{ {}[] }[]{} - []{}[ []{} ]{}[]
MATCH = { {}[] }
MATCH = [ []{} ]
------------------------------------------------------------

Update: For Perl v5.8, you'll need to change [...]++ to (?> [...]+ ) (the possessive '++' quantifier appeared in v5.10). The qr<...> delimiters will also need to be something else, e.g. qr!...!, because the '>' in '(?>' would otherwise close the pattern early.
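For anyone stuck on v5.8, here's roughly what those two changes look like when applied. Treat it as a sketch, not a tested-on-5.8 guarantee: extract_nested() is a hypothetical helper I've added just to make the matching loop reusable, and I've escaped the literal braces as \{ and \}, which is a cautious choice on my part and not something the original patterns do.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# v5.8-style variant: the atomic group (?> ... ) stands in for the
# possessive '++', and qr!...! stands in for qr<...>.
my ($brace_re, $bracket_re);
$brace_re   = qr! \{ (?: (?> [^{}]+ )   | (??{ $brace_re })   )* \} !x;
$bracket_re = qr! \[ (?: (?> [^\[\]]+ ) | (??{ $bracket_re }) )* \] !x;

my $re = qr! ( $brace_re | $bracket_re ) !x;

# extract_nested() is a hypothetical helper (not in the original post).
# It returns every balanced group longer than two characters, i.e. it
# applies the same "skip a bare {} or []" cheat as the script above.
sub extract_nested {
    my ($line) = @_;
    my @found;
    while ($line =~ /$re/g) {
        push @found, $1 if length $1 > 2;
    }
    return @found;
}

print "MATCH = $_\n"
    for extract_nested('a[] = [ this is a [ test ] { test2 } ]');
# prints: MATCH = [ this is a [ test ] { test2 } ]
```

No -l switch here, so the newline is added explicitly in the print.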

The '(??{ $re })' construct has been around since at least v5.8.8.

Here's the perlre doco for 5.8.8 and 5.10.0.

-- Ken


In reply to Re^3: Regular expressions: Extracting certain text from a line by kcott
in thread Regular expressions: Extracting certain text from a line by Wcool
