ibm1620 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I'm trying to capture and remove certain substrings from a string. In this example I want to capture and remove words starting with 'foo', preserving the residual string.

#!/usr/bin/env perl
use v5.36;

my $x = "xyzzy foo1 foo2";
my @captured;
while ($x =~ s{ \s* (foo\w) \s* }{}msxg) {
    push @captured, $1;
}
say "Captured: '$_'" for @captured;
say "Left with '$x'";

This gives:
Captured: 'foo2'
Left with 'xyzzy'

Removing the 'g' option gives me what I want:
Captured: 'foo1'
Captured: 'foo2'
Left with 'xyzzy'

But I'm not clear why the regex 'g' option in a while-loop behaves differently for matching than for substitution.

Also, I'm wondering if there's a better way to capture and remove a repeated pattern from a string like the above.

Re: while-loop regex substitution with 'g' option
by ikegami (Patriarch) on Apr 21, 2024 at 20:58 UTC
    "g" always means "all the matches".

    With s///g, it performs all the substitutions.

    With m//g, it normally returns all the matches.

    This is very similar.

    I said "normally" because m//g in scalar context can't return all the matches without external help. That's why a loop needs to be introduced.
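
The three behaviours described above can be put side by side in a minimal sketch. The variable names ($text, @all, @stepped, $copy) are illustrative and not from the thread; the pattern is the one from the question.

```perl
#!/usr/bin/env perl
use v5.36;

my $text = "xyzzy foo1 foo2";

# List context: m//g returns every capture in one call.
my @all = $text =~ m{ (foo\w) }xg;

# Scalar context: m//g finds one match per call, resuming from pos($text),
# so a loop is needed to visit every match.
my @stepped;
while ($text =~ m{ (foo\w) }xg) {
    push @stepped, $1;
}

# s///g performs every substitution in one call and returns how many it made;
# afterwards $1 holds the capture from the last successful match only.
my $copy  = $text;
my $count = $copy =~ s{ \s* (foo\w) \s* }{}xg;
my $last  = $1;

say "list:    @all";
say "stepped: @stepped";
say "s///g:   $count substitutions, last capture '$last', left with '$copy'";
```

This is why the while loop in the question ran its body only once: the first s///g call had already removed both words, and only the last capture was still in $1.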

    You're looking for

    $x =~ s{ \s* (foo\w) \s* }{ push @captured, $1; "" }xeg;
      Beautiful. Thanks.
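
For reference, here is ikegami's one-liner dropped into the test script from the question, as a sketch. Nothing here is new apart from the comments.

```perl
#!/usr/bin/env perl
use v5.36;

my $x = "xyzzy foo1 foo2";
my @captured;

# /e evaluates the replacement as Perl code for each match: record the
# capture, then return the empty string as the replacement text.
# /g makes a single call process every match, so no loop is needed.
$x =~ s{ \s* (foo\w) \s* }{ push @captured, $1; "" }xeg;

say "Captured: '$_'" for @captured;
say "Left with '$x'";
```

Note that the pattern removes whitespace on both sides of each word, so a removed word sitting between two kept words would leave its neighbours joined together.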
Re: while-loop regex substitution with 'g' option
by tybalt89 (Monsignor) on Apr 22, 2024 at 00:31 UTC

    Just for fun, here's a case where a loop is still needed for s///g

    #!/usr/bin/perl
    use strict; # https://perlmonks.org/?node_id=11159009
    use warnings;
    use v5.36;

    #my $x = "xyzzy foo1 foo2";
    my $x = "xxxxxyyyyyyyyxxxyyy";
    my @captured;
    while ($x =~ s{ (xy) }{}msxg) {
        push @captured, $1;
        say "To Go '$x'";
    }
    say "Captured: '$_'" for @captured;
    say "Left with '$x'";

    Outputs:

    To Go 'xxxxyyyyyyyxxyy'
    To Go 'xxxyyyyyyxy'
    To Go 'xxyyyyy'
    To Go 'xyyyy'
    To Go 'yyy'
    Captured: 'xy'
    Captured: 'xy'
    Captured: 'xy'
    Captured: 'xy'
    Captured: 'xy'
    Left with 'yyy'
      Right. When the substitution can produce new matches, you need to restart the scan from the beginning. Strictly speaking, then, is /g needed? hm.
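
On that closing question, a small sketch comparing the loop with and without /g on tybalt89's input. The helper strip_pairs is a hypothetical name introduced here, and the comparison only speaks for this pattern and input, not for substitutions in general.

```perl
#!/usr/bin/env perl
use v5.36;

# Repeat a substitution until it no longer matches; return the final
# string and the number of passes that changed it.
# strip_pairs is a hypothetical helper name, not from the thread.
sub strip_pairs ($string, $global) {
    my $passes = 0;
    if ($global) {
        $passes++ while $string =~ s/xy//g;
    }
    else {
        $passes++ while $string =~ s/xy//;
    }
    return ($string, $passes);
}

my $input = "xxxxxyyyyyyyyxxxyyy";
my ($with_g,    $passes_g)    = strip_pairs($input, 1);
my ($without_g, $passes_no_g) = strip_pairs($input, 0);

say "with /g:    '$with_g' after $passes_g passes";
say "without /g: '$without_g' after $passes_no_g passes";
```

Both variants end with the same string here, so for this case /g is not needed for correctness; it only lets each pass remove more than one pair, which means fewer trips around the loop.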
Re: while-loop regex substitution with 'g' option
by LanX (Saint) on Apr 21, 2024 at 19:13 UTC

    > differently for matching than for substitution.

    I'd say s///g always acts like m//g in list context (everything in one go). Unlike m//g, it has no special scalar-context mode that steps through one match at a time, resuming from pos().

    See perlretut

    Cheers Rolf