igoryonya has asked for the wisdom of the Perl Monks concerning the following question:

I don't get it:
I have the following:
my $equation = '979x + 87y - 8723z = 274320';
my @parts = ($equation =~ /^(?:(.*?)([xyz]))+/i);
I thought I would get this result:
@parts = ('979', 'x', ' + 87', 'y', ' - 8723', 'z');
but instead I get only the last match from the string, not all of them:
@parts = (' - 8723', 'z');
What am I missing here?

Update

The reason I start with ^ and use (?:...)+ instead of /g is that I want to slurp the entire string into array elements.
Here is the whole story:
my $equation = '979x + 87y - 8723z = 274320';
my @parts = ($equation =~ /^(?:(.*?)([xyz]))+(.*)=(.*)$/i);
So, instead of getting the desired result:
@parts = ('979', 'x', ' + 87', 'y', ' - 8723', 'z', ' ', ' 274320');
I am getting:
@parts = (' - 8723', 'z', ' ', ' 274320');

Replies are listed 'Best First'.
Re: Problem with capturing all matches with regex
by toolic (Bishop) on Oct 12, 2016 at 14:26 UTC

    I changed 3 things:

    • Got rid of ^ anchor
    • Got rid of + at end
    • Added //g for global matching
    use warnings;
    use strict;
    use Data::Dumper;

    my $equation = '979x + 87y - 8723z = 274320';
    my @parts = ($equation =~ /(?:(.*?)([xyz]))/ig);
    print Dumper(\@parts);

    __END__
    $VAR1 = [
              '979',
              'x',
              ' + 87',
              'y',
              ' - 8723',
              'z'
            ];

    Tip #9 from the Basic debugging checklist ... YAPE::Regex::Explain

      OK, I abbreviated my problem at first, but your answer suggests that I should tell the whole story, for example why I didn't use /g.
      I've updated my original post.
        #!/usr/bin/perl -l
        # http://perlmonks.org/?node_id=1173839
        use strict;
        use warnings;

        my $equation = '979x + 87y - 8723z = 274320';
        #my @parts = ($equation =~ /^(?:(.*?)([xyz]))+/i);
        my @parts = grep defined, $equation =~ /(.*?)([xyz])|(.*?)=(.*)/gi;
        use Data::Dumper; print Dumper \@parts;
Re: Problem with capturing all matches with regex
by choroba (Cardinal) on Oct 12, 2016 at 14:29 UTC
    (...)+ only returns the last match. That's how quantified capture brackets behave. Use /g and while if you want to do something with the pairs:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    my $equation = '979x + 87y - 8723z = 274320';
    my @parts;
    push @parts, "[$1 $2]" while $equation =~ /(.*?)([xyz])\s*/gi;
    say for @parts;

    Or assign directly to the array:

    my @parts = $equation =~ /(.*?)([xyz])\s*/gi;
    say for @parts;
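To see the behaviour described above in isolation, here is a minimal sketch (the names @last_only and @all_pairs are made up for illustration). A capture group inside a quantified group is overwritten on every iteration, so only the final iteration survives, whereas /g in list context returns the captures of every separate match.

```perl
use strict;
use warnings;

my $equation = '979x + 87y - 8723z = 274320';

# One match, three iterations of the quantified group:
# $1 and $2 are overwritten each time, so only the last pair is returned.
my @last_only = $equation =~ /^(?:(.*?)([xyz]))+/i;

# Three separate matches: list-context /g returns the captures of each one.
my @all_pairs = $equation =~ /(.*?)([xyz])/gi;

print scalar(@last_only), " elements: @last_only\n";
print scalar(@all_pairs), " elements: @all_pairs\n";
```

The first list should hold two elements and the second six.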

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      So this means I have to do it this way (according to my updated version):
      my $equation = '979x + 87y - 8723z = 274320';
      my @parts = ($equation =~ /(.*?)([xyz])/ig);
      push(@parts, ($equation =~ /\G(.*)=(.*)/));
      Too bad, I thought I could do it in one swipe.
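One caveat about the two-step version above, worth verifying on your own Perl: per perlop, a failed match resets pos() unless /c is given, and a list-context /g keeps matching until it fails. The \G in the second statement would then anchor at the start of the string rather than just after the z. A hedged sketch of one way around this, using scalar-context /gc so that pos() is preserved (an alternative for illustration, not what was posted):

```perl
use strict;
use warnings;

my $equation = '979x + 87y - 8723z = 274320';
my @parts;

# Scalar-context /g advances pos() one match at a time; /c keeps pos()
# where it was when the loop's final attempt fails.
while ($equation =~ /\G(.*?)([xyz])/gci) {
    push @parts, $1, $2;
}

# \G now anchors right after the 'z', so only the remainder is split at '='.
if ($equation =~ /\G(.*)=(.*)/) {
    push @parts, $1, $2;
}

print join('|', @parts), "\n";
```

The \G in the loop also guarantees that each pair starts exactly where the previous one ended, so nothing in between is silently skipped.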
        ... do it in one swipe.

        See tybalt89's post for a one-swiper. If you don't like the grep defined, ... before the array assignment and if you have Perl version 5.10+ which supports the (?| ... | ... ) "branch reset" grouping, there's this:

        c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
        "use 5.010;
         ;;
         my $equation = '979x + 87y - 8723z = 274320';
         my @parts = $equation =~ m{ \G (?| (.*?)([xyz]) | (.*) = (.*) \z) }xmsig;
         dd \@parts;
        "
        [979, "x", " + 87", "y", " - 8723", "z", " ", " 274320"]

        Update: Please see  "(?|pattern)" in Extended Patterns in perlre.
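A small side-by-side sketch may make the difference between the two one-pass versions clearer. The names @raw and @reset are hypothetical; the patterns are the ones from the posts above, written without /x. With plain alternation the groups are numbered 1 to 4 across both branches, so each match contributes two undef slots, which is why grep defined was needed. With branch reset both branches share groups 1 and 2.

```perl
use strict;
use warnings;
use 5.010;    # (?|...) branch reset needs Perl 5.10 or later

my $equation = '979x + 87y - 8723z = 274320';

# Plain alternation: every match returns four slots, two of them undef.
my @raw = $equation =~ /(.*?)([xyz])|(.*?)=(.*)/gi;

# Branch reset: both branches reuse groups 1 and 2, so every match returns
# exactly two defined values and no filtering is needed.
my @reset = $equation =~ /\G(?|(.*?)([xyz])|(.*)=(.*)\z)/gi;

my $undef_count = grep { !defined } @raw;
say scalar(@raw),   " slots, $undef_count undef";
say scalar(@reset), " slots";
```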


        Give a man a fish:  <%-{-{-{-<

Re: Problem with capturing all matches with regex
by stevieb (Canon) on Oct 12, 2016 at 14:31 UTC

    You've got a couple of issues there. By making it a bit simpler, and adding the /g (global match) flag, it seems to work.

    The ^ means match only at the beginning of the string, so you'll only get a single match, even with the global flag set, and the (?:) non-capture grouping isn't required.

    my $equation = '979x + 87y - 8723z = 274320';

    # catch anything that precedes an x, y or z, and then
    # catch that letter as well into another capture group
    my @parts = ($equation =~ /(.*?)([xyz])/ig);
    print "$_\n" for @parts;

    Output:

    979
    x
     + 87
    y
     - 8723
    z
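The point about the ^ anchor can be checked with a short sketch (the names @anchored and @unanchored are made up for illustration). Without the /m flag, ^ matches only at the very start of the string, so after the first match no later /g attempt can succeed.

```perl
use strict;
use warnings;

my $equation = '979x + 87y - 8723z = 274320';

# Anchored: the second /g attempt starts after the first 'x', where ^
# cannot match, so only one pair is returned.
my @anchored = $equation =~ /^(.*?)([xyz])/ig;

# Unanchored: /g keeps matching from where the previous match ended.
my @unanchored = $equation =~ /(.*?)([xyz])/ig;

print scalar(@anchored), " vs ", scalar(@unanchored), "\n";
```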
Re: Problem with capturing all matches with regex
by duelafn (Parson) on Oct 12, 2016 at 15:11 UTC

    I was hoping that the named-capture variables would collect all of your matches, but they also only keep the last one, so as an alternative, I can only suggest embedded code (the (?:...)++ ensures no backtracking over your push @lhs code):

    #!/usr/bin/perl
    use 5.010;
    use Data::Dumper;

    my $equation = '979x + 87y - 8723z = 274320';
    my @lhs;
    die "match failed" unless $equation =~ /
        ^
        (?:
            (?<coeff>.*?) (?<var>[xyz])
            (?{ push @lhs, $+{coeff}, $+{var} })
        )++
        \s* = \s*
        (?<rhs>.*)
        $
    /ix;
    my $rhs = $+{rhs};
    say Dumper([\@lhs, $rhs]);

    This approach also allows you to construct something more interesting than a flat list, should that be desirable.
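As a rough illustration of "something more interesting than a flat list", the same pattern could push one array reference per term. This is only a sketch: @terms is a hypothetical name, and requiring Perl 5.18 is an assumption made to stay clear of older problems with lexical variables inside (?{...}) blocks.

```perl
use strict;
use warnings;
use 5.018;    # assumption: new enough to avoid older (?{...}) lexical issues

my $equation = '979x + 87y - 8723z = 274320';
my @terms;    # hypothetical: one [coefficient, variable] pair per term

die "match failed" unless $equation =~ /
    ^
    (?:
        (?<coeff> .*? ) (?<var> [xyz] )
        (?{ push @terms, [ $+{coeff}, $+{var} ] })
    )++
    \s* = \s*
    (?<rhs> .* )
    $
/ix;
my $rhs = $+{rhs};

say "$_->[1]: '$_->[0]'" for @terms;
say "rhs: '$rhs'";
```

Each element of @terms keeps a coefficient together with its variable, so later code does not have to walk the list two items at a time.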

    Good Day,
        Dean

      Oh, yea, you have the best solution!
      I wanted to use embedded code, but since I've never done it, I couldn't figure out how to use it. I read about it in perlre, but didn't understand exactly how to use it.
      You have a very good, understandable example.
      Now I can optimize a lot of my regexes with it. :)
        I wanted to use embedded code ... Now I can optimize a lot of my regexes with it.

        Beware. Such "optimization" is often a snare and a delusion. Many times it is more conducive to efficiency to understand how the Perl regex engine captures and returns groups and to take advantage of these inherent mechanisms. Also be aware that previous to Perl version 5.18 (I think), there existed a bug that caused weird interactions between embedded regex code and "external" (if that's the right term) my variables.

        OTOH of course, sometimes embedded code is just the only way to do it. :)
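For this particular equation, one way to lean on the built-in capture mechanisms instead of embedded code is to split at the '=' first and then use list-context /g on the left-hand side. This is a hedged sketch, not something posted in the thread; $lhs, $rhs and $tail are illustrative names.

```perl
use strict;
use warnings;

my $equation = '979x + 87y - 8723z = 274320';

# Split once at the '=' sign; the limit of 2 keeps any later '=' in $rhs.
my ($lhs, $rhs) = split /=/, $equation, 2;

# List-context /g collects every (coefficient, variable) pair of the
# left-hand side using only the engine's normal capture behaviour.
my @parts = $lhs =~ /(.*?)([xyz])/ig;

# Whatever trails the last variable (here a single space) corresponds to
# the third group of the original pattern.
my ($tail) = $lhs =~ /([^xyz]*)\z/i;
push @parts, $tail, $rhs;

print join('|', @parts), "\n";
```

It takes three statements rather than one regex, but each step is easy to test on its own.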


Re: Problem with capturing all matches with regex
by dr3ad (Initiate) on May 30, 2022 at 05:43 UTC

    The original question did not request the equation's result (the part after the '=' sign). A solution to that is retained at the bottom of this comment.

    The updated question does (I missed that). In case someone is searching for something similar, where the intent is to get the separate components of the equation into an array, here is one way to capture the parts of the equation into an array without extraneous whitespace in the array values, without attaching operators to operands, and also keeping the '=' sign rather than inferring it.

    my $equation = '979x + 87y - 8723z = 274320';
    my $vars = 'xyz';
    my $operators = qr/[-+*\/%=]/;
    my @parts = $equation =~ /([^\s$vars]+|[$vars]|$operators)/g;

    perl -d
      DB<1> $equation = '979x + 87y - 8723z = 274320'
      DB<2> $vars='xyz'
      DB<3> $operators=qr/[-+*\/%=]/
      DB<4> x $equation =~ /([^\s$vars]+|[$vars]|$operators)/g
    0  979
    1  'x'
    2  '+'
    3  87
    4  'y'
    5  '-'
    6  8723
    7  'z'
    8  '='
    9  274320

    Here is a solution for the original, non-updated question that eliminates the equation's result. It captures one of these two strings:

    1. Anything that is not an x, y, or z that is followed by an x, y, or z.
    2. An x, y, or z.

    Requiring what is captured in (1) to be followed by an x, y, or z prevents ' = 274320' from being captured.

    my @parts = $equation =~ /([^xyz]+(?=[xyz])|[xyz])/g;
    perl -d
      DB<1> $_='979x + 87y - 8723z = 274320'
      DB<2> x /([^xyz]+(?=[xyz])|[xyz])/g
    0  979
    1  'x'
    2  ' + 87'
    3  'y'
    4  ' - 8723'
    5  'z'
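To show what the lookahead contributes, here is a short comparison sketch (the array names are hypothetical). Dropping the (?=[xyz]) lets the run of non-variable characters after the z be captured as well, which brings the equation's result back into the list.

```perl
use strict;
use warnings;

my $equation = '979x + 87y - 8723z = 274320';

# With the lookahead, a run of non-variable characters is captured only
# when a variable follows it, so the trailing ' = 274320' is skipped.
my @with_lookahead = $equation =~ /([^xyz]+(?=[xyz])|[xyz])/g;

# Without it, the same run is captured unconditionally, result included.
my @without_lookahead = $equation =~ /([^xyz]+|[xyz])/g;

print scalar(@with_lookahead), " vs ", scalar(@without_lookahead), "\n";
```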