Alien has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!

If I have the following code:

my $str = "Test*.Value";
print "$1 $2\n" if ($str =~ /(.*?)\*(.*?)/);

Why do I get only Test returned from the regex (in $1)? Shouldn't $2 contain Value too?

Replies are listed 'Best First'.
Re: regex question
by NetWallah (Canon) on Sep 01, 2008 at 07:13 UTC
    It also works if you make the second capture "greedy":
    print "$1 $2\n" if($str=~/(.*?)\*(.*)/); # Remove "?" in second capture
    The reason is that a non-greedy dot-star with nothing after it in the pattern is satisfied by matching nothing, so $2 ends up as the empty string.
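To see the difference side by side, here is a small sketch that runs the original pattern, the greedy variant, and an anchored variant against the same string. The array names are only for illustration.

```perl
use strict;
use warnings;

my $str = "Test*.Value";

# Lazy trailing group: nothing follows it in the pattern,
# so matching zero characters is already a success.
my @lazy = $str =~ /(.*?)\*(.*?)/;

# Greedy trailing group: takes everything up to the end of the string.
my @greedy = $str =~ /(.*?)\*(.*)/;

# Lazy group followed by $: it has to stretch to the end for the match to succeed.
my @anchored = $str =~ /(.*?)\*(.*?)$/;

printf "lazy:     [%s] [%s]\n", @lazy;       # [Test] []
printf "greedy:   [%s] [%s]\n", @greedy;     # [Test] [.Value]
printf "anchored: [%s] [%s]\n", @anchored;   # [Test] [.Value]
```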

      But what if I have a string containing "abc*def*ghi" (or any number of "words")? Is there a way to obtain abc,def,ghi without resorting to split?
        In that case, you can use m//g in list context:
        my $s = 'abc*def*ghi';
        my @words = ($s =~ /([^*]+)/g);
        You can't have a dynamic number of captures in Perl 5, which is why you need either split or a regex with the /g modifier, used in list context or in a while loop.
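For the while-loop form, a minimal sketch might look like this (variable names are just for illustration). In scalar context each //g match picks up where the previous one ended, so $1 holds one word per iteration.

```perl
use strict;
use warnings;

my $s = 'abc*def*ghi';
my @words;

# Scalar-context //g: every iteration finds the next run of non-* characters.
while ($s =~ /([^*]+)/g) {
    push @words, $1;
}

print join(',', @words), "\n";   # abc,def,ghi
```

The loop is mostly useful when you want to do something per match (for example, look at pos($s)); if you only need the list, the list-context one-liner does the same job.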

        There are hacks, though:

        $_ = "abc*def*ghi";
        my @matches;
        m/(?: ([^*]+) (?{ push @matches, $^N }) \*?)+/x;
        print join('|', @matches), $/;

        But before you use this, read the big fat warning in perlre about how experimental (?{ ... }) is.

        But what if I have a string containing "abc*def*ghi" (or any number of "words")? Is there a way to obtain abc,def,ghi without resorting to split ?
        Is it possible that you're not describing what you want to do? Normally, it's good if you can use split—it's much faster for this kind of job than pulling the string apart with regexes. According to Benchmark (with all the usual caveats: on my machine, at this time of day, &c.), split /\*/, $string is about 3 times faster than the clever my @matches = $string =~ /([^*]+)/g suggested in Re^3: regex question.
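If you want to check the timing on your own machine, a rough sketch with the core Benchmark module could look like the following. The test string is made up for illustration, and the ratio you see will depend on your Perl version, hardware, and input, so no particular figure is promised here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: 50 short words joined by '*'.
my $string = join '*', ('word') x 50;

# Sanity check first: both approaches should yield the same list for this input.
my @by_split = split /\*/, $string;
my @by_regex = $string =~ /([^*]+)/g;
die "results differ" unless "@by_split" eq "@by_regex";

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    split => sub { my @m = split /\*/, $string },
    regex => sub { my @m = $string =~ /([^*]+)/g },
});
```

Note that the two are not interchangeable on every input: for a string like "a**b", split returns an empty field between the two stars, while the regex skips it.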
Re: regex question
by broomduster (Priest) on Sep 01, 2008 at 07:13 UTC
    The non-greedy *? is perfectly happy to eat up nothing unless you tell it how far to go. So either use the greedy version for the second capture or anchor with, say, $. Try either of these:

    print "$1 $2\n" if($str=~/(.*?)\*(.*)/);

    print "$1 $2\n" if($str=~/(.*?)\*(.*?)$/);
    Note that both of them produce:
    Test .Value
    Elimination of that . left as an exercise. ;-)
      Thanks for your replies! I am grateful!
Re: regex question
by lamp (Chaplain) on Sep 01, 2008 at 07:09 UTC
    Hi,
    You forgot to match the literal dot: add \. to the regex. Please check the following code.
    my $str = "Test*.Value";
    print "$1 $2\n" if($str=~/(.*?)\*\.(.*)/);
    --lamp
Re: regex question
by Anonymous Monk on Sep 01, 2008 at 08:34 UTC