Here's another approach...
# Takes a marked-up string, a regular expression that selects the
# tag(s) we're interested in, and a code reference that does the
# sub-string transformation. Returns the modified string.
sub parse_and_replace {
    my ( $string, $tag_match, $transform_sub ) = @_;
    my @context_stack;
    my %deferred_transforms;

    # loop matching tags of the form {word} and {/word}
    while ( $string =~ m!(\{(/)?(\w+)\})!g ) {
        my ( $tag, $tag_length ) = ( $3, length($1) );
        my $is_close = $2 ? 1 : 0;

        if ($is_close) {
            # pop and possibly transform on finding a matching close tag
            # syntax check: properly nested?
            my $popped = pop @context_stack
                or die "close '$tag' has no matching open tag\n";
            if ( $tag ne $popped->{tag} ) {
                die "close '$tag' mis-matched with open '$popped->{tag}'\n";
            }

            # save the tagged text (tags included) if the tag matches
            # the tag_match param
            if ( $tag =~ /$tag_match/ ) {
                my $start  = $popped->{pos};
                my $length = pos($string) - $start;
                my $text   = substr( $string, $start, $length );
                if ( !$deferred_transforms{$text} ) {
                    # pass a copy, since the transform may modify $_[0]
                    $deferred_transforms{$text} = $transform_sub->("$text");
                }
            }
        }
        else {
            # just push onto the context stack on finding an open tag
            push @context_stack,
                { tag => $tag, pos => pos($string) - $tag_length };
        }
    }

    # now do the replacements, longest text first, so that an enclosing
    # block is replaced before the blocks nested inside it; \Q...\E
    # keeps braces and dots in $text from acting as regex syntax
    foreach my $text ( sort { length($b) <=> length($a) } keys %deferred_transforms ) {
        $string =~ s/\Q$text\E/$deferred_transforms{$text}/g;
    }
    return $string;
}

# and to invoke:
my $string = q(
Outside.
{tag}
Inside level 1.
{tag}
Inside level 2.
{/tag}
Inside level 1.
{/tag}
Outside.
);

my $sub = sub {
    $_[0] =~ s/\{tag\}(.+)\{\/tag\}/--Marked--\n$1\n--EndMarked--/gis;
    return $_[0];
};

print "RESULT: " . parse_and_replace( $string, 'tag', $sub );

I do think it's worth noting that parsing/manipulating recursively nested markup is not trivial. (You should test the heck out of any custom-written solution -- including the one I just supplied -- before you even start to think about trusting it for your application.) I second Merlyn's advice about rolling modules into your distribution; Parse::RecDescent is powerful, flexible and debugged!

It isn't possible, as you're discovering, to do this kind of parsing with simple regexps. Unless you're willing to put severe limits on allowed markup structure, you'll need to parse recursively (or cheat a bit and save some context, as my code does, and as IO's code does in a much niftier way).
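To make the "simple regexps" problem concrete, here is a small sketch (the sample strings and the `<...>` replacement are made up for illustration). A non-greedy match pairs the outer open tag with the inner close tag, while a greedy match copes with nesting but swallows sibling blocks:

```perl
use strict;
use warnings;

# Hypothetical sample input with one level of nesting.
my $nested = "a {tag} b {tag} c {/tag} d {/tag} e";

# Non-greedy: the outer {tag} gets paired with the inner {/tag}.
( my $lazy = $nested ) =~ s!\{tag\}(.*?)\{/tag\}!<$1>!gs;

# Hypothetical sample input with two sibling blocks.
my $siblings = "{tag} x {/tag} y {tag} z {/tag}";

# Greedy: the first {tag} gets paired with the last {/tag}.
( my $greedy = $siblings ) =~ s!\{tag\}(.*)\{/tag\}!<$1>!gs;

print "$lazy\n";      # a < b {tag} c > d {/tag} e
print "$greedy\n";    # < x {/tag} y {tag} z >
```

Neither quantifier gets both inputs right, which is why some record of nesting depth (a stack, or recursion) is needed.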

And parsing is only half the battle -- the transformation can be tricky, too. Unless you're willing to limit the kind of transformation that's allowed, you have to build a tree, do the transformations on each tree node, then put the tree back together into a string. (My sub above sidesteps tree-ization by limiting transformations to simple, stateless, one-to-one mappings between a given "{tag}content{/tag}" string and a "result" string.)
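For comparison, here is a rough sketch of the tree route: parse into nested nodes, transform from the inside out, and rebuild the string. The names `parse_nodes` and `render`, the node layout, and the `<<...>>` transform are all invented for this illustration and are not from any module; treat it as a starting point to test, not a finished parser.

```perl
use strict;
use warnings;

# Parse the text behind $text_ref into a list of nodes. A node is either
# a plain string or a hash { tag => 'name', kids => [ nodes ] }.
# $open_tag is the tag we are currently inside (undef at top level).
sub parse_nodes {
    my ( $text_ref, $open_tag ) = @_;
    my @nodes;
    # \G plus /gc continues from wherever the last match (possibly in a
    # nested call) left pos() on the shared string.
    while ( $$text_ref =~ m!\G(.*?)\{(/?)(\w+)\}!gcs ) {
        my ( $plain, $is_close, $tag ) = ( $1, $2, $3 );
        push @nodes, $plain if length $plain;
        if ($is_close) {
            die "unexpected close '$tag'\n"
                unless defined $open_tag && $tag eq $open_tag;
            return \@nodes;
        }
        push @nodes, { tag => $tag, kids => parse_nodes( $text_ref, $tag ) };
    }
    die "unclosed '$open_tag'\n" if defined $open_tag;
    # whatever follows the last tag is plain trailing text
    my $rest = substr( $$text_ref, pos($$text_ref) // 0 );
    push @nodes, $rest if length $rest;
    return \@nodes;
}

# Rebuild the string, children first. Tags matching $tag_match are
# handed to $transform as ( tag name, already-rendered content );
# other tags are written back unchanged.
sub render {
    my ( $nodes, $tag_match, $transform ) = @_;
    my $out = '';
    for my $node (@$nodes) {
        if ( !ref $node ) { $out .= $node; next; }
        my $inner = render( $node->{kids}, $tag_match, $transform );
        $out .= $node->{tag} =~ /$tag_match/
            ? $transform->( $node->{tag}, $inner )
            : '{' . $node->{tag} . '}' . $inner . '{/' . $node->{tag} . '}';
    }
    return $out;
}

my $input  = "Outside. {tag} Level 1. {tag} Level 2. {/tag} Level 1. {/tag} Outside.";
my $tree   = parse_nodes( \$input );
my $result = render( $tree, qr/^tag$/, sub { '<<' . $_[1] . '>>' } );
print "$result\n";
```

Because each node is transformed after its children, the transform sees already-converted inner content and no string search-and-replace is needed. The cost is that the transform receives tag name and content separately rather than the raw "{tag}content{/tag}" string.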

Kwin

In reply to Re: Properly transforming strings with nested markup tags by khkramer
in thread Properly transforming strings with nested markup tags by tocie
