I've tried $var =~ /abc|def|ghi/ but a string such as abdefhi is a false positive. I've also tried /(abc|def|ghi)/ and /(abc)|(def)|(ghi)/ but the aforementioned abdefhi matches all of those.

/abc|def|ghi/, /(abc|def|ghi)/ and /(abc)|(def)|(ghi)/ all match "abdefhi" because they aren't anchored. An unanchored regex can match "def" anywhere in the string, including in the middle. To force Perl to match "abc", "def", or "ghi" against the whole string, one must anchor the regular expression with "^" and "$" (or "\z"). "^" matches just before the first character. "\z" matches only at the very end of the string. "$" matches at the end of the string or just before a newline that ends the string, so it tolerates one trailing newline where "\z" does not.
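To make the anchoring behaviour concrete, here is a minimal sketch using a single string, "def", so that only the anchors differ between the three patterns. The sub names are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers: each returns 1 if its pattern matches, 0 otherwise.
sub matches_unanchored { my ($s) = @_; return $s =~ /def/     ? 1 : 0 }  # "def" anywhere
sub matches_dollar     { my ($s) = @_; return $s =~ /^def$/   ? 1 : 0 }  # allows one trailing newline
sub matches_z          { my ($s) = @_; return $s =~ /^def\z/  ? 1 : 0 }  # nothing may follow "def"

for my $s ("def", "abdefhi", "def\n") {
    (my $shown = $s) =~ s/\n/\\n/g;    # make the newline visible when printing
    printf "%-8s unanchored=%d dollar=%d z=%d\n",
        $shown, matches_unanchored($s), matches_dollar($s), matches_z($s);
}
```

The unanchored pattern accepts "abdefhi", both anchored patterns reject it, and "def\n" is the one input where "$" and "\z" disagree.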

To add "^" and "$" you must surround "abc|def|ghi" with parentheses. Either capturing (...) or non-capturing (?:...) may be used. Otherwise each anchor attaches only to the alternative next to it, because "|" binds more loosely than anything else in the regex. For example, in $var =~ /^abc|def|ghi\z/; Perl looks for one of three alternatives: "abc" at the beginning of the string, "def" anywhere in the string, or "ghi" at the end of the string. By contrast, /^(abc|def|ghi)\z/ and /^(?:abc|def|ghi)\z/ (see post by ikegami) match only when the entire string is exactly "abc", "def", or "ghi".
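A small sketch of the difference between the ungrouped and grouped forms, run against a few sample strings. The sub names and the test strings other than "abdefhi" are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for illustration only.
# Ungrouped: "^" applies only to "abc", "\z" only to "ghi", "def" floats free.
sub match_ungrouped { my ($s) = @_; return $s =~ /^abc|def|ghi\z/     ? 1 : 0 }
# Grouped: both anchors apply to every alternative.
sub match_grouped   { my ($s) = @_; return $s =~ /^(?:abc|def|ghi)\z/ ? 1 : 0 }

for my $s (qw(abc def ghi abdefhi abcxyz xxghi)) {
    printf "%-8s ungrouped=%d grouped=%d\n",
        $s, match_ungrouped($s), match_grouped($s);
}
```

The ungrouped pattern accepts "abdefhi" (it contains "def"), "abcxyz" (it starts with "abc") and "xxghi" (it ends with "ghi"), while the grouped pattern accepts only the three exact strings.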

In this case non-capturing parentheses are the better choice. Capturing parentheses stuff whatever they match into a variable ($1 for the first pair). But in this case, if the regex matches at all, it matches the whole string, so you already have it in $var.
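A short sketch of why the capture is redundant here: with the anchored, capturing form, $1 can only ever be a copy of the input string. The sub name capture_whole is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns what $1 captured, or undef if there was no match.
sub capture_whole {
    my ($s) = @_;
    return $s =~ /^(abc|def|ghi)\z/ ? $1 : undef;
}

for my $s (qw(abc def ghi abdefhi)) {
    my $captured = capture_whole($s);
    if (defined $captured) {
        # Because of the anchors, the capture always equals the whole input.
        printf "%-8s matched, \$1 eq input: %s\n", $s, ($captured eq $s ? "yes" : "no");
    }
    else {
        printf "%-8s no match\n", $s;
    }
}
```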

Hope this explains why the regexes given by kennethk and ikegami do work.

Best, beth

Update - 2009-07-27 - struck out the portion below as incorrect or no longer applicable. In fact, /abc|def|ghi/ matches "abc" or "def" or "ghi" anywhere in the string.

"|" only defines alternatives between adjacent regex components, so /abc|def|ghi/ and /(abc|def|ghi)/ both mean match "ab" followed by either c or d followed by "e", followed by either f or g followed by "hi". To get "|" to treat "abc", "def", and "ghi" as alternative whole strings you must surround each string "abc", "def", "ghi" with non-capturing parentheses. Non-capturing parentheses are spelled (?:regex). They tell Perl - treat this sequence of letters as a single regular expression.

You can also surround "abc", "def", "ghi" with plain parentheses. Plain parentheses also group sequences of letters into a single regular expression, but they also "capture" the match and stuff it into a variable.

This is wasteful unless you need to stuff the match into a variable. Even if you do need to stuff the match into a variable, it probably won't do what you expect. Perl will treat each match as a separate variable and populate $1 with "abc" if $var contains "abc" and undef if it doesn't. To stuff whichever of the three happens to match into $1, one needs to surround the whole set of alternatives with capturing parentheses, like this: ((?:abc)|(?:def)|(?:ghi)).


In reply to Re: regex matching specific strings. by ELISHEVA
in thread regex matching specific strings. by nafion112
