In addition to using either the non-greedy quantifier (.*?) or skipping up to the next > ([^>]*), you also want to capture the text up to the matching end-tag, for which you just need a non-greedy quantifier inside capturing parens. So your regex should look like:
m{<APPEND\b[^>]*>(.*?)</APPEND>}
This puts the text between the tags into $1; if you really want the ending tag, too, just move the closing paren to after </APPEND>. I added the \b to make sure you only match <APPEND> tags, and not, e.g., <APPENDIX>. The only caveats on this are:
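- You might want to add a /i modifier to the match, in case someone adds the tags in lower case.
- If the text between the tags can span more than one line, add a /s modifier as well, since . does not match a newline without it.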
- If there's ever a chance of a '>' appearing in the attributes of the tag, you need something more complicated.
The following (untested, but based on Friedl's Mastering Regular Expressions) should work:
m{<APPEND\b(?:"[^"]*"|'[^']*'|[^'">])*>(.*?)</APPEND>}
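Here is a quick sketch of how the two patterns behave side by side. The sample strings and variable names are made up for illustration, and I've added /i and /s on the assumption that you want lower-case tags and multi-line text to match, so adjust to taste:

```perl
use strict;
use warnings;

# Hypothetical sample input; the attributes and text are invented.
my $simple = '<APPEND id="1">first</APPEND> <APPENDIX>skip me</APPENDIX>';
my $tricky = q{<append note="a > b">second</append>};
my $multi  = "<APPEND>\nthird\n</APPEND>";

# Basic pattern: \b keeps <APPENDIX> from matching, (.*?) captures lazily.
# /i accepts lower-case tags, /s lets . match newlines in the text.
my $basic = qr{<APPEND\b[^>]*>(.*?)</APPEND>}is;

# Quote-aware pattern: steps over quoted attribute values whole, so a '>'
# inside quotes does not end the opening tag early.
my $quoted = qr{<APPEND\b(?:"[^"]*"|'[^']*'|[^'">])*>(.*?)</APPEND>}is;

my ($plain)    = $simple =~ $basic;    # 'first'
my ($broken)   = $tricky =~ $basic;    # ' b">second' (tag ended too early)
my ($fixed)    = $tricky =~ $quoted;   # 'second'
my ($spanning) = $multi  =~ $basic;    # "\nthird\n"

# A lone <APPENDIX> element should not match at all.
my $appendix_matched = ('<APPENDIX>skip me</APPENDIX>' =~ $basic) ? 1 : 0;

print "basic on simple:  [$plain]\n";
print "basic on tricky:  [$broken]\n";
print "quoted on tricky: [$fixed]\n";
```

The second print shows the failure the last caveat is about: the basic pattern stops at the '>' inside the quoted attribute, so the capture picks up the tail of the tag.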
HTH,
--roundboy