beanryu has asked for the wisdom of the Perl Monks concerning the following question:

Hello there. Say I have a file containing:

ABC1D A1D

Say I want to match the following regex: A(BC|)(.*)D, which will match either A.*D or ABC.*D. Now I have the following code:

    @matches = ($_ =~ /A(BC|)D/g)
    if(@matches != 0){
        foreach(@matches){
            print "$_.\n";
        }
    }
    }

The code prints: 1 BC 1
How can I make it not print BC and only print the numbers? Thanks a lot in advance.

Replies are listed 'Best First'.
Re: how to make // not return what is in a parenthesis?
by AnomalousMonk (Archbishop) on Aug 01, 2010 at 01:54 UTC

    One way would be:

    >perl -wMstrict -le "my $s = 'ABC1D A2D D3A foo4bar'; my @matches = $s =~ m{ A (?: BC)? (\d+) D }xmsg; print qq{'$_'} for @matches; "
    '1'
    '2'

    Updates:

    1. Like chromatic, I cannot get the OP's code example to work as posted.
      I am going by the first regex in the OP: 'A(BC|)(.*)D'.
    2. Slightly altered code example to make results clearer.
    3. See the non-capturing grouping expression  (?:pattern) in the Extended Patterns section of perlre.
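
To make the effect of the non-capturing group concrete, here is a minimal sketch that runs the same match twice on the sample line from the question, once with a capturing alternation and once with (?:...). The variable names are illustrative only, and (\d+) is assumed in place of (.*) because the question asks for the numbers.

```perl
use strict;
use warnings;

my $s = 'ABC1D A1D';    # sample line from the original question

# Capturing alternation: every match contributes two values to the list.
my @with_capture = $s =~ /A(BC|)(\d+)D/g;

# Non-capturing alternation: only the digits group is returned.
my @without_capture = $s =~ /A(?:BC|)(\d+)D/g;

print "capturing:     ", join(' ', map { "'$_'" } @with_capture), "\n";
print "non-capturing: ", join(' ', map { "'$_'" } @without_capture), "\n";
```

In list context with /g, a match returns the contents of every capturing group for every match, so the first list also carries 'BC' and an empty string, while the second holds only the two digit captures.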

      Thank you so much, sir. Yes, this is what I am looking for. Thanks a lot.
Re: how to make // not return what is in a parenthesis?
by ww (Archbishop) on Aug 01, 2010 at 02:21 UTC
    Your code does not compile:
    1. missing semicolon at line 1
    2. too many closing curly brackets

    And your code would not produce the output you assert even if those errors were corrected. Perhaps the most trivial issue is that your line four would print a dot between the value of $_ and the newline, since you've quoted the entire output, not just the \n. But frankly, I can't even think of a way, offhand, that any plausible typo in your posting would produce the alleged output.

    Based on your comments, I'm guessing that you believed the vertical bar amounts to an "or" in your regex. It can represent "or", but not in the manner you've presented it. perldoc perlretut may help you on that.

    An alternative might be that the bar is a typo for the numeral "1", but even that would never cause the code you've presented to produce the output you posted.

    What you've presented asks the regex engine to match:

    a single character, "A"
    a string, "BC|" (which is to be captured)
    followed by a "D"
          as many times as possible
          (which is none, nil, never.)
    So what did you really mean? What code (if any) actually produced the output you quote?
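
As a quick check of the claim that the posted code cannot produce the quoted output, here is a hypothetical repaired version of the snippet: the missing semicolon is added and the extra closing brace dropped. The $line variable stands in for $_ as read from the file, and the final count line is an addition for illustration.

```perl
use strict;
use warnings;

# Stand-in for $_ holding the sample line from the question.
my $line = 'ABC1D A1D';

# Same regex as posted: it requires a D directly after 'A' or 'ABC'.
my @matches = ($line =~ /A(BC|)D/g);

if (@matches != 0) {
    foreach (@matches) {
        print "$_.\n";
    }
}

# Added for illustration: report how many values the match returned.
print scalar(@matches), " match(es)\n";
```

On this input the regex never matches, because a digit sits between the 'A' or 'ABC' and the 'D', so the loop body is never reached.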

      I agree that the code given in the OP doesn't make sense, especially with respect to the output shown.

      However, the first regex given in the OP, 'A(BC|)(.*)D', is actually not far from working (insofar as I understand what beanryu really wants to do).

      The  (BC|) group will match either 'BC' or the empty string. Just making this group non-capturing would be a big step forward. (My own preference for expressing this pattern would be (?:BC)? .) The next step would be to make the quantifier in the second (capturing) group lazy: (.*?) . (However, since beanryu seems to want to capture digits, I would rather express this as (\d+) .)

      With just these changes, the regex actually works more or less as beanryu seems to want:

      >perl -wMstrict -le "my $s = 'ABC1D A2D'; my @matches = $s =~ /A(?:BC|)(.*?)D/g; print qq{'$_'} for @matches; "
      '1'
      '2'
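
The reason for the lazy quantifier may be easier to see side by side. The following is a sketch only, with illustrative variable names, comparing the greedy, lazy, and digits-only forms on the same string.

```perl
use strict;
use warnings;

my $s = 'ABC1D A2D';

my @greedy = $s =~ /A(?:BC|)(.*)D/g;     # .* runs on to the last D in the string
my @lazy   = $s =~ /A(?:BC|)(.*?)D/g;    # .*? stops at the first D that allows a match
my @digits = $s =~ /A(?:BC)?(\d+)D/g;    # only digits allowed between the prefix and D

print "greedy: ", join(', ', map { "'$_'" } @greedy), "\n";
print "lazy:   ", join(', ', map { "'$_'" } @lazy),   "\n";
print "digits: ", join(', ', map { "'$_'" } @digits), "\n";
```

The greedy form yields a single capture, '1D A2', because it swallows everything up to the final D. The lazy and digits-only forms each yield '1' and '2'.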
Re: how to make // not return what is in a parenthesis?
by chromatic (Archbishop) on Aug 01, 2010 at 01:52 UTC

    I can't reproduce your output, but are you looking for something more like:

    while (/A(BC)?D/g) { push @matches, $1 if $1; }
      Actually, I am just looking for a way to make // not return what is inside the parentheses when I put its result into an array. Because I have (expr 1|expr 2) in the regex to match, it always puts the text matched by (expr 1|expr 2) into the array, even though I am using the parentheses only because I have to use the | in there.
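
For the situation described here, where parentheses are needed only to scope the |, a while loop like the one above can be combined with a non-capturing group. This is a rough sketch under assumptions: the input lines and the alternatives BC and XY are made up for illustration, standing in for "expr 1" and "expr 2".

```perl
use strict;
use warnings;

my @lines = ('ABC1D A2D', 'AXY3D');    # hypothetical input lines
my @matches;

for my $line (@lines) {
    # (?:BC|XY|) scopes the alternation without creating a capture,
    # so $1 refers to the digits group.
    while ($line =~ /A(?:BC|XY|)(\d+)D/g) {
        push @matches, $1;
    }
}

print "@matches\n";    # prints: 1 2 3
```

Because the alternation group no longer captures, only the digits end up in @matches, whichever alternative matched.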