Hello Monks,

I have some doubts about the regex engine in Perl. As I understand it, there are fundamentally two kinds of regex engine: NFA and DFA. My source is the link below, which I hope is not obsolete yet: master regular expression pdf - Chapter 4. According to that, Perl is based on an NFA engine. But consider the following match:

$_ = "The first recorded efforts to reach Everest's summit were made by British mountaineers ";
/summit|Everest|mountain/;
print $&;    # Result is "Everest"

As I understand an NFA engine, control is driven by the regex: on reaching the alternation, the engine tries each alternative in turn, so "summit" is checked first and matches. At that point an overall match is achieved and the engine should stop, but it does not. The result looks more like what a DFA engine would produce, since there control is driven by the target text: the engine scans the target string "The first ....", finds "Everest" first, the regex is satisfied, and an overall match is achieved. Even if the NFA engine picks the leftmost match, shouldn't "summit" at least be evaluated once? So I tried a second approach:

$_ = "The first recorded efforts to reach Everest's summit were made by British mountaineers ";
/summit(?{print "11"})|Everest(?{print "22"})|mountain(?{print "33"})/;
print $&;    # 22Everest

The result shows that "summit" is not evaluated! Did I misunderstand something about alternation or about the regex engine? Or is Perl no longer a pure NFA but a hybrid NFA + DFA? Dear Monks, please enlighten me.
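
One experiment that might help separate the two explanations, as a minimal sketch: put two alternatives that can both match at the same start position in different orders and see which one wins. The helper name first_match is made up for illustration. The assumption being tested is that a POSIX-style DFA would report the longest alternative regardless of order, while a backtracking NFA would report whichever alternative is listed first at the leftmost position where any of them matches.

```perl
use strict;
use warnings;

my $text = "The first recorded efforts to reach Everest's summit were made by British mountaineers ";

# Hypothetical helper: returns the matched substring and its start offset,
# or an empty list if the pattern does not match.
sub first_match {
    my ($re, $str) = @_;
    return unless $str =~ $re;
    # @- and @+ hold the start and end offsets of the last successful match.
    return (substr($str, $-[0], $+[0] - $-[0]), $-[0]);
}

# Original pattern: three alternatives that start at different positions.
my ($m1, $p1) = first_match(qr/summit|Everest|mountain/, $text);

# Two alternatives that can both match at the same start position.
my ($m2, $p2) = first_match(qr/Eve|Everest/, $text);
my ($m3, $p3) = first_match(qr/Everest|Eve/, $text);

print "summit|Everest|mountain -> '$m1' at offset $p1\n";
print "Eve|Everest             -> '$m2' at offset $p2\n";
print "Everest|Eve             -> '$m3' at offset $p3\n";
```

If the second and third lines differ only because the alternatives were reordered, that points to ordered (NFA-style) alternation applied at the leftmost start position where some alternative matches, rather than to a DFA.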

Thanks in advance!


In reply to Is RegEx in Perl not NFA anymore? by redbull2012
