Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Here is my problem ... I need to make substitutions on all occurrences of a pattern that occur between defined markers but nowhere else in the string, for example:

my $text = 'An abitrary BEGIN_MARKER string which END_MARKER could contain BEGIN_MARKER just about END_MARKER anything' ;

The markers always occur in pairs, there could be any number of pairs, and the markers themselves must be retained.

How can I make a greedy substitution between BEGIN_MARKER and END_MARKER, but nowhere else?

I guess it should be easy, but I have just spent 4 hours with Mastering Regular Expressions, Perl Cookbook and Programming Perl ... and I still can't figure it out.

The nearest I have got is

$text =~ s/(BEGIN_MARKER.*?)a(.*?END_MARKER)/$1 x $2/g;

intended to replace every 'a' with 'x' (that is of course a simplification of what I am really trying to do).

It doesn't work -- it finds all marker pairs, but substitutes only the first occurrence of 'a' between each pair.

I don't even know if I am close. Should I be trying to use a backreference? I'm new to all this.
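For reference, the behaviour described above can be reproduced with a short script. This is only a sketch: the helper name first_attempt and the sample text are made up for illustration, and the replacement is written as $1x$2 (no spaces) so the output is easy to compare. Each pass of /g consumes a whole BEGIN_MARKER ... END_MARKER span, so at most one 'a' per pair is ever replaced.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the attempt from the question.
sub first_attempt {
    my ($text) = @_;
    # Each match swallows everything from BEGIN_MARKER to END_MARKER,
    # so /g resumes searching only after the end marker.
    $text =~ s/(BEGIN_MARKER.*?)a(.*?END_MARKER)/$1x$2/g;
    return $text;
}

# Hypothetical sample text with several 'a's inside each pair of markers.
my $sample = 'BEGIN_MARKER banana END_MARKER banana BEGIN_MARKER papaya END_MARKER';
print first_attempt($sample), "\n";
# prints: BEGIN_MARKER bxnana END_MARKER banana BEGIN_MARKER pxpaya END_MARKER
```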

Replies are listed 'Best First'.
Re: Greedy Substitution Within a Defined Range
by BrowserUk (Patriarch) on Apr 21, 2004 at 13:19 UTC

    Do it in two steps.

    $text =~ s[(?<=BEGIN_MARKER)(.*?)(?=END_MARKER)]{
        (my $rep = $1) =~ s[i][x]g;   # For single char subs you could use tr// here
        $rep;
    }ge;

    print $text;

    An abitrary BEGIN_MARKER strxng whxch END_MARKER could contain BEGIN_MARKER just about END_MARKER anything

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
      Double points off for (my $rep = $1) =~ s/... That should be two lines. This is how I'd solve the problem, btw. Nested substitutions are a really, really nice thing.
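A minimal sketch of the same nested substitution with the copy and the inner s/// on separate lines. The sub name replace_between_markers is hypothetical; the markers, the sample text and the i -> x rule are the ones used above.

```perl
use strict;
use warnings;

# Hypothetical helper name; only the regex technique comes from the thread.
sub replace_between_markers {
    my ($text) = @_;
    $text =~ s{(?<=BEGIN_MARKER)(.*?)(?=END_MARKER)}{
        my $inner = $1;        # copy first: the inner substitution resets the capture variables
        $inner =~ s/i/x/g;     # the real substitution goes here
        $inner;                # the value of the block becomes the replacement
    }ge;
    return $text;
}

my $text = 'An abitrary BEGIN_MARKER string which END_MARKER could contain BEGIN_MARKER just about END_MARKER anything';
print replace_between_markers($text), "\n";
# prints: An abitrary BEGIN_MARKER strxng whxch END_MARKER could contain BEGIN_MARKER just about END_MARKER anything
```

Because the outer pattern uses a lookbehind and a lookahead, the markers themselves are never part of the match and so are retained untouched.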

        Many many thanks to those who have given their time to help us. You make it look so easy, but I don't think it is really. We have used diotalevi's nested substitution solution -- it fits our purposes (and is remarkably fast).

        ::: Gratitude :::

Re: Greedy Substitution Within a Defined Range
by halley (Prior) on Apr 21, 2004 at 13:16 UTC
    Update: fixed the case between END and BEGIN markers.
    1 while $text =~ s/(BEGIN_MARKER(?:(?!END_MARKER).)*?)a(.*?END_MARKER)/$1x$2/;
    This repeats the search and replace until it isn't found anymore. Not exactly "within one regex" but it is still pretty readable.

    The part which reads (?:(?!END_MARKER).)*? non-greedily accepts characters one at a time, refusing to step onto the start of your end marker. It's a negative lookahead inside a non-capturing group.
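A small runnable sketch of this loop. The sub name loop_replace and the sample text are hypothetical, chosen so that 'a' occurs both inside and outside the marker pairs; the pattern is the one given above.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the loop from this reply.
sub loop_replace {
    my ($text) = @_;
    # Each pass replaces one 'a' that sits between a BEGIN_MARKER and the next
    # END_MARKER. The (?!END_MARKER) check stops the first group from running
    # past an end marker, so an 'a' outside the markers is never reached.
    1 while $text =~ s/(BEGIN_MARKER(?:(?!END_MARKER).)*?)a(.*?END_MARKER)/$1x$2/;
    return $text;
}

my $sample = 'a BEGIN_MARKER banana END_MARKER a banana BEGIN_MARKER papaya END_MARKER a';
print loop_replace($sample), "\n";
# prints: a BEGIN_MARKER bxnxnx END_MARKER a banana BEGIN_MARKER pxpxyx END_MARKER a
```

The loop ends only because every pass removes one 'a'; if the replacement text itself contained the search pattern, it would never finish.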

    --
    [ e d @ h a l l e y . c c ]

Re: Greedy Substitution Within a Defined Range
by borisz (Canon) on Apr 21, 2004 at 14:08 UTC
    Use Regexp::Common in case the begin and end markers can be nested.

    #!/usr/bin/perl
    use Regexp::Common 'balanced';

    my $text = 'An abitrary BEGIN_MARKER string which END_MARKER could contain BEGIN_MARKER just about END_MARKER anything';
    my $q;
    $q = qr!$RE{balanced}{-begin => "BEGIN_MARKER"}{-end => "END_MARKER"}{-keep}!;
    $text =~ s:$q:local $_ = $1; s/a/X/g; $_:eg;   # /g on the inner s/// so every 'a' is replaced
    print $text;

    Boris

Re: Greedy Substitution Within a Defined Range
by matija (Priest) on Apr 21, 2004 at 13:25 UTC
    Hmmm. How about:

    if ($text =~ /BEGIN_MARKER(.*?)END_MARKER/) {    # grab the text between the markers
        my $relevant = $1;
        $relevant =~ s/a/x/g;    # or whatever...
        $text =~ s/(BEGIN_MARKER).*?(END_MARKER)/$1$relevant$2/;    # put it back (first pair only)
    }
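The same extract, modify, put back idea can be stretched to any number of marker pairs by cutting the string at the markers first. This is a sketch of a different technique (split with a capturing group), not something proposed in the thread; the sub name substitute_inside is hypothetical and the markers are assumed never to nest.

```perl
use strict;
use warnings;

# Hypothetical helper; assumes markers come in non-nested pairs.
sub substitute_inside {
    my ($text) = @_;
    # The capturing group makes split return the markers as pieces of their own.
    my @pieces = split /(BEGIN_MARKER|END_MARKER)/, $text;
    my $inside = 0;
    for my $piece (@pieces) {
        if    ($piece eq 'BEGIN_MARKER') { $inside = 1 }
        elsif ($piece eq 'END_MARKER')   { $inside = 0 }
        elsif ($inside)                  { $piece =~ s/a/x/g }   # or whatever...
    }
    return join '', @pieces;
}

my $text = 'An abitrary BEGIN_MARKER string which END_MARKER could contain BEGIN_MARKER just about END_MARKER anything';
print substitute_inside($text), "\n";
# prints: An abitrary BEGIN_MARKER string which END_MARKER could contain BEGIN_MARKER just xbout END_MARKER anything
```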