Hello,

This is my first posting to the keepers of wisdom, so I'll try to keep it brief. I've found that I need to stack a lot of regular expressions in order to force pattern matching to occur first on larger, complex patterns and then on the smaller patterns that they are composed of. It seems to me that this is simply greedy matching, with the special circumstance that the largest patterns are made up of optional and obligatory combinations of other patterns, each of which will match at least the minimal pattern. For instance:
$np1 = "(?:$det|$gen)";
$np2 = "(?:$adj|$num|$conj|$adv|$inf)";
$np3 = "(?:$np1*\s*($noun)*\s*$np2*\s*($noun)+\s*$adj*)";
used together in the following manner:

$NP = "(?:(?:$np1)*\s*$np2*(?:$np3)+)";

As I've mentioned, I want to match the longest patterns first but still allow matching on the smaller patterns, which is my reason for putting Kleene stars on the optional subpatterns. The problem I'm having is that the optionality leads to matching the minimal patterns and never the optionally longer ones.

My question, then, is whether I need to keep doing what I am doing now: matching the longest patterns first, then the next longest, and so on down to the minimal patterns? I ask because an OR grouping ordered from greatest coverage to least also seems to miss the longer patterns.

To sum up, I need to match long patterns composed of smaller patterns, where the long ones match first and, failing that, the short ones match. If my question is overly simple, or my discussion of it unclear, I apologize in advance.

Thanks,
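
For readers following along, here is a minimal sketch of the behaviour being asked about, not the poster's actual grammar. The lexicons $det, $adj and $noun below are hypothetical stand-ins, since the post does not show them. It assumes the pieces are built with qr// rather than double-quoted strings, because inside a double-quoted string "\s" is interpolated as a plain "s". It also shows two things that may be relevant here: greedy quantifiers already try the longest run first, while alternation takes the first alternative that matches, not the longest one.

```perl
use strict;
use warnings;

# Hypothetical toy lexicons; the real $det, $adj, $noun are not shown in the post.
my $det  = qr/(?:the|a|an)/;
my $adj  = qr/(?:big|red|old)/;
my $noun = qr/(?:dog|house|barn)/;

# qr// keeps \s and \b as regex escapes. Each optional piece carries its own
# whitespace, and at least one noun is obligatory, so the pattern can never
# match the empty string.
my $NP = qr/\b(?:$det\s+)?(?:$adj\s+)*$noun(?:\s+$noun)*\b/;

my $text = "the big red dog saw a barn house near water";
my @phrases;
while ($text =~ /($NP)/g) {
    push @phrases, $1;    # greedy ? and * take the optional parts when present
}
print "$_\n" for @phrases;
# prints:
# the big red dog
# a barn house

# Alternation is ordered, not longest-match: the first alternative that
# succeeds at the current position wins.
my ($first)   = "an" =~ /^(a|an)/;    # "a"
my ($longest) = "an" =~ /^(an|a)/;    # "an"
print "$first $longest\n";            # prints: a an
```

In this sketch a single pattern picks up both the long and the short phrases, so no cascade of separate regexes is needed. Whether that carries over to the full grammar depends on details the post does not show, such as how $det and the other subpatterns are defined and whether each is wrapped in its own group before a quantifier is applied to it.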

In reply to greedy and lazy by Anonymous Monk
