I just picked up the new second edition of Mastering Regular Expressions and was marveling at how much regex support exists in other languages, at features I didn't know existed, and - boggle of boggles - at something Java did better(?) than Perl.

In particular, I looked with great interest at what the author dubbed 'atomic grouping'. Perl appears to support it as the (?>...) construct. Java offers a shorter form: add a + to another quantifier (++ *+ ?+ {m,n}+), which it calls a 'possessive quantifier'. For those of you who, like me, had never run into this critter before, the simple explanation is this:

Possessive quantifiers take the greedy quantifiers and tell them to never let go of anything they've grabbed. Potentially, they can make matches faster by eliminating unnecessary backtracking.
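To make that concrete, here is a minimal sketch in Java using java.util.regex. The class and method names (PossessiveDemo, greedy, possessive, atomic) are made up for illustration and do not come from the book. It runs the same input through a greedy quantifier, a possessive quantifier, and the equivalent atomic group:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class PossessiveDemo {
    // Greedy: \d+ first grabs every digit, then gives one back
    // so that the trailing \d has something to match.
    static boolean greedy(String s) {
        return Pattern.matches("\\d+\\d", s);
    }

    // Possessive: \d++ grabs every digit and never gives any back,
    // so the trailing \d has nothing left and the match fails.
    static boolean possessive(String s) {
        return Pattern.matches("\\d++\\d", s);
    }

    // Atomic group: (?>\d+) behaves like \d++ here; once the group
    // has matched, the engine will not backtrack into it.
    static boolean atomic(String s) {
        return Pattern.matches("(?>\\d+)\\d", s);
    }

    public static void main(String[] args) {
        String input = "12345";
        System.out.println("greedy:     " + greedy(input));     // true
        System.out.println("possessive: " + possessive(input)); // false
        System.out.println("atomic:     " + atomic(input));     // false
    }
}
```

The failing cases are the point: the possessive and atomic forms refuse to retry shorter matches, which is where the potential speedup comes from, but it also means they can reject input that the greedy form accepts.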

I lit on atomic grouping as a useful feature, and found it documented logically yet rather quietly as 'match nonbacktracking subpattern' in the Camel 3. The possessive extra '+' seemed like a convenient shorthand for many purposes. Many questions have sprouted in my mind:

  1. Do Those Who Make The Changes plan on adding the possessive quantifiers (++ *+ ?+ {m,n}+) to the regex engine in 5.10?
  2. To give this node an additional purpose, does anyone have some particularly good examples of when they have used possessive quantifiers or (?>) in Real Life (tm)?
  3. Lastly, does anyone know how to cure the desire to treat the above "(tm)?" as zero-or-one occurrences of the string 'tm', captured? I think I've got REs on the brain. :-)

Edit: Fixed broken link. larsen


In reply to Possessive Quantifiers by Ferret
