in reply to Question on Regular Expression

This is what C programmers call 'undefined behaviour'. There is no point in trying to explain why those regexes do what they do (whatever that is).

perlre:

There is a special form of this construct (look-behind), called "\K" (available since Perl 5.10.0), which causes the regex engine to "keep" everything it had matched prior to the "\K" and not include it in $& ($MATCH). This effectively provides variable-length look-behind. The use of "\K" inside of another look-around assertion is allowed, but the behaviour is currently not well defined.

perlvar:

In Perl v5.18 and earlier, it (${^MATCH}) is only guaranteed to return a defined value when the pattern was compiled or executed with the "/p" modifier. In Perl v5.20, the "/p" modifier does nothing, so "${^MATCH}" does the same thing as $MATCH.
The OP is using 5.010.
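
For contrast, here is a minimal sketch of "\K" used on its own, outside any look-around, where its behaviour is well defined. The sub names and the "key=value" test string are made up for illustration; the "/p" modifier is included because, per the perlvar note above, ${^MATCH} is only guaranteed to be set with it on 5.10.

```perl
use strict;
use warnings;
use 5.010;    # \K and the /p modifier need Perl 5.10 or later

# Hypothetical helper: return the part of the match that follows \K.
sub kept_match {
    my ($s) = @_;
    # Everything matched before \K ("key=") is left out of ${^MATCH}.
    return $s =~ m/ key= \K \w+ /xp ? ${^MATCH} : undef;
}

# Hypothetical helper: replace only the part after \K, keeping the prefix.
sub replace_value {
    my ($s, $new) = @_;
    (my $copy = $s) =~ s/ key= \K \w+ /$new/x;
    return $copy;
}

printf "kept:     %s\n", kept_match('key=value');
printf "replaced: %s\n", replace_value('key=value', 'other');
```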

Re^2: Question on Regular Expression
by sjain (Initiate) on Dec 27, 2014 at 18:33 UTC

    I have removed the second "\K". But this does not seem to help.

    Here is what I intended to do.

    Let's say I have a string "RC1XY" which has 4 parts, and when matched it would be captured as follows:

    Part 1 : RC => captured in P_ROOTCODE

    Part 2 : 1 => captured in DAY1

    Part 3 : X => captured in P_MON_CODE

    Part 4 : Y => captured in P_NEW_MON_CODE

    But if the string passed is "RS" (instead of "RC1XY"), I was expecting P_ROOTCODE to hold "RS" and the rest of the captures (DAY1, P_MON_CODE, P_NEW_MON_CODE) to be blank. Instead, even P_ROOTCODE is blank because of this undefined behavior.

    Can you please let me know if there is an alternative approach that captures the different parts when the string (e.g. "RS") does not match the whole pattern?

    I hope I have made clear what is intended, and I am hoping for a solution or an alternative approach.

      Sometimes it's best to start simple with these things. Here's an alternate approach that seems to do what you seem to want done:

      c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "my @test = qw(RC1XY RS);
       ;;
       for my $s (@test) {
           printf qq{'$s' -> };
           my ($p_r, $d1, $p_mc, $p_new_mc) =
               $s =~ m{ ([[:upper:]]+) (?: (\d+) ([[:upper:]]) ([[:upper:]]))? }xms;
           dd $p_r, $d1, $p_mc, $p_new_mc;
           }
      "
      'RC1XY' -> ("RC", 1, "X", "Y")
      'RS' -> ("RS", undef, undef, undef)
      How does this match your basic requirements? What further elaborations and sophistications are needed? Do you really need named captures? Etc... (Update: You mention that you're using Perl 5.10, but both these examples, above and below, run the same for me under 5.8.9 and 5.14.4 as well as 5.10.1.)

      Update: I notice you write that you want a "blank" (which I take to be an empty string) to be produced for sub-patterns that do not match. You will note that the example above yields undefined values for non-matching sub-patterns. Since the empty string and undef both have a false boolean value, I find it is usually just as easy to test and deal with undefs as with empty strings and so I usually avoid the extra effort to produce an empty string. If you really need them, here's a possible alternative approach:

      c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "my @test = qw(RC1XY RS);
       ;;
       for my $s (@test) {
           printf qq{'$s' -> };
           my ($p_r, $d1, $p_mc, $p_new_mc) =
               $s =~ m{ ([[:upper:]]+) (\d*) ([[:upper:]]?) ([[:upper:]]?) }xms;
           dd $p_r, $d1, $p_mc, $p_new_mc;
           }
      "
      'RC1XY' -> ("RC", 1, "X", "Y")
      'RS' -> ("RS", "", "", "")
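
      If named captures do turn out to be needed, the first pattern could be written roughly as below. This is only a sketch: the capture names are the ones from your post, the sub name is made up, and the \A and \z anchors are my assumption that the whole string must match. Non-matching parts are turned into empty strings with the // operator, which (like named captures) needs 5.10 or later.

      ```perl
      use strict;
      use warnings;
      use 5.010;    # named captures and // need Perl 5.10 or later

      # Hypothetical helper: return a name => value list for the four parts,
      # or an empty list if the string does not match at all.
      sub named_parts {
          my ($s) = @_;
          return unless $s =~ m{
              \A
              (?<P_ROOTCODE> [[:upper:]]+ )
              (?:                                  # the tail is optional as a whole
                  (?<DAY1>           \d+         )
                  (?<P_MON_CODE>     [[:upper:]] )
                  (?<P_NEW_MON_CODE> [[:upper:]] )
              )?
              \z
          }xms;
          # Captures that did not take part in the match are absent from %+.
          return map { ( $_ => $+{$_} // '' ) }
                 qw(P_ROOTCODE DAY1 P_MON_CODE P_NEW_MON_CODE);
      }

      for my $s (qw(RC1XY RS)) {
          my %part = named_parts($s);
          printf "'%s' -> %s\n", $s,
              join ', ', map { qq{$_="$part{$_}"} } sort keys %part;
      }
      ```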

        Thanks for the approach; I really appreciate your help with this.

        Your snippet produces the result I asked for, but I need something more.

        I think the example I gave was too simple to illustrate the problem I faced. In my earlier example I used the regular expression '(.*) (0-9) (A-Z) (A-Z)'.

        I ran the code snippet you gave me for 'RW12QW1XY' and it does not work, whereas the expected outcome is as below:

        'RW12QW1X' -> ("RW12QW", "1", "X", "")

        Here are more example strings :

        1. Sample1Repeat1A -> ("Sample1Repeat", "1", "A", "")

        2. Sample2Repeat2 -> ("Sample2Repeat", "2", "", "")

        3. Sample3Repeat -> ("Sample3Repeat", "", "", "")

        4. 4SampleRepeat -> ("4SampleRepeat", "", "", "")

        5. 4SampleRepeat4 -> ("4SampleRepeat", "4", "", "")

        6. 5SampleRepeat5D -> ("5SampleRepeat", "5", "D", "")

        I hope these samples give more information.
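
        To make the rule behind these samples concrete, here is a rough sketch of a pattern that appears to reproduce them (taking the root of the last sample to be "5SampleRepeat"). It is an assumption about the intended rule, not a tested solution: the root is matched as lazily as possible, followed by an optional tail of one digit and up to two upper-case letters at the end of the string. The sub name is made up.

        ```perl
        use strict;
        use warnings;
        use 5.010;    # the // operator needs Perl 5.10 or later

        # Hypothetical helper: split a string into
        # (root, day, month code, new month code), with '' for missing parts.
        sub split_code {
            my ($s) = @_;
            my @parts = $s =~ m{
                \A
                (.+?)                  # root: as short as possible ...
                (?:                    # ... followed by an optional tail
                    (\d)               # exactly one digit
                    ([[:upper:]]?)     # optional upper-case letter
                    ([[:upper:]]?)     # optional second upper-case letter
                )?
                \z
            }xms;
            return map { $_ // '' } @parts;    # undef -> '' when the tail is absent
        }

        for my $s (qw(RC1XY RS RW12QW1X Sample1Repeat1A Sample2Repeat2
                      Sample3Repeat 4SampleRepeat 4SampleRepeat4 5SampleRepeat5D)) {
            printf "'%s' -> (%s)\n", $s, join ', ', map { qq{"$_"} } split_code($s);
        }
        ```

        The tail is optional as a whole, so letters are only taken off the end when a digit comes before them; that is what keeps "RS" together as the root.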