Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!
I am working on a script and got stuck trying to understand a line that contains a regular expression. Could someone explain what this line is doing?
${ ($acc =~ /(^USA|BRA|CAN)?(.*?)(XY)?$/) ? $2 : $acc }
Thanks for the help!!!

Re: Regular Expression Translation Help!
by kennethk (Abbot) on Nov 30, 2010 at 20:04 UTC
    Using YAPE::Regex::Explain, the code
    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new(qr/(^USA|BRA|CAN)?(.*?)(XY)?$/)->explain();
    outputs
    The regular expression:

    (?-imsx:(^USA|BRA|CAN)?(.*?)(XY)?$)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      (                        group and capture to \1 (optional
                               (matching the most amount possible)):
    ----------------------------------------------------------------------
        ^                        the beginning of the string
    ----------------------------------------------------------------------
        USA                      'USA'
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        BRA                      'BRA'
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        CAN                      'CAN'
    ----------------------------------------------------------------------
      )?                       end of \1 (NOTE: because you are using a
                               quantifier on this capture, only the LAST
                               repetition of the captured pattern will be
                               stored in \1)
    ----------------------------------------------------------------------
      (                        group and capture to \2:
    ----------------------------------------------------------------------
        .*?                      any character except \n (0 or more times
                                 (matching the least amount possible))
    ----------------------------------------------------------------------
      )                        end of \2
    ----------------------------------------------------------------------
      (                        group and capture to \3 (optional
                               (matching the most amount possible)):
    ----------------------------------------------------------------------
        XY                       'XY'
    ----------------------------------------------------------------------
      )?                       end of \3 (NOTE: because you are using a
                               quantifier on this capture, only the LAST
                               repetition of the captured pattern will be
                               stored in \3)
    ----------------------------------------------------------------------
      $                        before an optional \n, and the end of the
                               string
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    The result of the match is then fed into the conditional operator (? :), which selects the second capture group, (.*?), if the regex matched, or the whole string $acc if it didn't. Finally, the resulting string is used as a symbolic reference by the enclosing ${ ... }. Good luck. I suspect that the author intended (^USA|BRA|CAN) to actually be ^(USA|BRA|CAN).
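    To see which string the conditional picks, it can help to run just the inner expression on a few values. The following is only a sketch: the helper name core_of and the sample inputs are made up for illustration, and the outer ${ ... } dereference is left out on purpose because it would not run under strict refs.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the string
# that the original expression would go on to dereference with ${ ... }.
sub core_of {
    my ($acc) = @_;
    # Same regex and conditional as in the question.
    my $core = ($acc =~ /(^USA|BRA|CAN)?(.*?)(XY)?$/) ? $2 : $acc;
    return $core;
}

# Made-up sample inputs, one per line of output.
for my $acc (qw(USA123XY BRA123 123XY 123 12BRA3)) {
    printf "%-10s => %s\n", $acc, core_of($acc);
}
```

    With these inputs, a leading prefix and a trailing XY are dropped from what ends up in $2, while a BRA in the middle of the string is left alone.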

Re: Regular Expression Translation Help!
by roboticus (Chancellor) on Nov 30, 2010 at 20:13 UTC

    You already got a nice answer from kennethk. I just wanted to add that you may want to move the caret left one position though: Currently USA must be at the start of the string, but BRA and CAN can be anywhere....

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Would this be OK, or is the right thing to move the caret outside the group, to its left?
      Could the first way work?
      ${ ($acc =~ /(^USA|^BRA|^CAN)?(.*?)(IN)?$/i) ? $2 : $acc }
      or
      ${ ($acc =~ /^(USA|BRA|CAN)?(.*?)(IN)?$/i) ? $2 : $acc }
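        For ordinary single-line strings the results in the two cases are identical, since ^ is a zero-width match and every alternative is anchored either way. (They can differ only if the string contains an embedded newline, because the first pattern as a whole is not anchored.) I personally think the second option is better because it is clearer and has fewer characters, hence is less sensitive to typos, but that is wholly subjective.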
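        One way to convince yourself is to run both patterns side by side. This is a rough sketch, not a proof: the sub names strip_inner and strip_outer and the single-line sample strings are invented for the comparison, and only the value of $2 is compared.

```perl
use strict;
use warnings;

# Variant 1: a caret inside every alternative.
sub strip_inner {
    my ($acc) = @_;
    my $r = ($acc =~ /(^USA|^BRA|^CAN)?(.*?)(IN)?$/i) ? $2 : $acc;
    return $r;
}

# Variant 2: a single caret in front of the group.
sub strip_outer {
    my ($acc) = @_;
    my $r = ($acc =~ /^(USA|BRA|CAN)?(.*?)(IN)?$/i) ? $2 : $acc;
    return $r;
}

# Made-up single-line samples; /i makes the match case-insensitive.
for my $acc (qw(USA123IN bra123 123in 12CAN3 CHINA)) {
    my $inner = strip_inner($acc);
    my $outer = strip_outer($acc);
    my $verdict = $inner eq $outer ? 'same' : 'DIFFERENT';
    printf "%-10s inner=%-8s outer=%-8s %s\n", $acc, $inner, $outer, $verdict;
}
```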
Re: Regular Expression Translation Help!
by samarzone (Pilgrim) on Dec 01, 2010 at 09:32 UTC

    This statement simply removes one of "USA", "BRA" or "CAN" from the start of the string and "XY" from the end of the string, if found.