in reply to matching and mysterious captures

If you modify your code to

 echo -e 'zero\none\ntwo\n\n    three' | perl -0777 -pe 's{\s*(one\ntwo)\s*?( *)}{\n\n<begin block>\n\n$1\n\n<end block>\n\nx$2x}'

so you have a clear delimiter around your second match, you get:

zero

<begin block>

one
two

<end block>

xx

    three

By swapping your whitespace matching to non-greedy, you are no longer consuming the newlines preceding 'three', so ( *) is looking at a newline and captures nothing. You can get your expected result by using the multiline modifier m (see Modifiers in perlre) combined with the line-start metacharacter ^:

echo -e 'zero\none\ntwo\n\n    three' | perl -0777 -pe 's{\s*(one\ntwo)\s*^( *)}{\n\n<begin block>\n\n$1\n\n<end block>\n\n$2}m'

yields

zero

<begin block>

one
two

<end block>

    three

Re^2: matching and mysterious captures
by Allasso (Monk) on May 05, 2011 at 17:15 UTC
    I see, I was matching the minimum which was 0 \s's and 0 spaces.

    Your example works, except in the case of:
    echo -e 'zero\none\ntwo three' | perl -0777 -pe 's{\s*(one\ntwo)\s*^( *)}{\n\n<begin block>\n\n$1\n\n<end block>\n\n$2}m'

    zero
    one
    two three
      What do you expect to get for the case you've listed? It falls outside all of the cases you've described above (see I know what I mean. Why don't you?). If you expect to get

      zero

      <begin block>

      one
      two

      <end block>

          three

      which cleans out blank lines but keeps the spaces before three, you can use a non-capturing group (?:...) combined with the ? quantifier to make the newline-and-line-start matching optional:

       echo -e 'zero\none\ntwo    three' | perl -0777 -wpe 's{\s*(one\ntwo)(?:\s*^)?( *)}{\n\n<begin block>\n\n$1\n\n<end block>\n\n$2}m'

        Ah yes, that's what I was looking for. I knew something like that would work, but I couldn't quite get the right combination.

        Sorry for the confusion; I didn't state my needs very clearly.