in reply to (tedv)Re: Template system and substitution
in thread Template system and substitution

Um, if we had single-character delimiters, then the suggestion in Death to Dot Star would apply; use a negated character class in place of "dot":

m# \[ [^\]]* \] #x
but that isn't the case here.
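
As a rough illustration of why the single-character trick does not carry over, here is a minimal sketch. The sample strings and variable names are made up for the example, and the two-character delimiters "[(" and ")]" are assumed.

```perl
use strict;
use warnings;

# Single-character delimiters: the negated class stops at the first "]".
my $single = "a [one] b [two] c";
my @single = $single =~ m# \[ ( [^\]]* ) \] #xg;

# Two-character delimiters "[(" and ")]" (assumed for this sketch).
# A character class can only exclude single characters, so excluding ")"
# also rejects a ")" that is not followed by "]".
my $double = "x [( f(1) )] y";
my @double = $double =~ m# \[\( ( [^)]* ) \)\] #xg;

print "single: @single\n";
print "double matched ", scalar(@double), " block(s)\n";
```

The second pattern finds nothing because the class stops at the ")" inside "f(1)", which is not a closing delimiter.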

If you want me to do that complex trick to avoid matching a closing (or even opening) delimiter in the middle, then I don't think you read the thread. Lots of good people tried and failed to get it right. I don't recommend that trick, since I've never seen it proposed (even by amazing people) without someone finding an error in it.

Plus, it wouldn't gain us anything in this case. If I see an opening delimiter in this situation, I want to force the match to start there. It would be an error to have extra, unmatched starting delimiters. We aren't trying to parse English here.

As for matching nested blocks, that pretty much violates the original design. For defensive programming, I'd probably have the expand() routine look for and warn of opening delimiters in the match string since these might indicate that a closing delimiter was dropped or munged. But to be completely defensive would require more work than that.
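
The defensive check could look something like the sketch below. Both stray_openers() and checked_expand() are hypothetical names, the opening delimiters "[(", "[|" and "[{" are assumed, and the real expansion step is passed in as a code reference because its body isn't shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: return every opening delimiter found inside a block.
# The delimiter set "[(", "[|", "[{" is an assumption of this sketch.
sub stray_openers {
    my( $code )= @_;
    return $code =~ m#( \[ [(|{] )#xg;
}

# Hypothetical wrapper: warn about stray openers, then call the real
# expansion routine, which is supplied by the caller.
sub checked_expand {
    my( $code, $expand )= @_;
    for my $opener (  stray_openers( $code )  ) {
        warn "Found $opener inside a block; was a closing delimiter dropped?\n";
    }
    return $expand->( $code );
}
```

This only warns; it makes no attempt to recover, which matches the "more work than that" caveat above.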

        - tye (but my friends call me "Tye")

(tedv)Re2: Template system and substitution
by tedv (Pilgrim) on Nov 17, 2000 at 04:12 UTC
    Well, understandably, the parenthesis matching here is extremely difficult. My intuition is that a regex isn't the right solution; reformulating the problem is. :) At the very least, you could just rip out all the []s and get back to single-character pairing with nesting. What strikes me as odd is why the data would even be in that format to begin with... Surely you could come up with a more meaningful output format...

    -Ted

      Well if you code up an alternate way to parse this, then I'd be interested in seeing a benchmark comparison. I think my solution was pretty simple, efficient, and correct.

      My suggestion would be to use a templating module (probably Template Toolkit based on what others say about these things). But if that isn't acceptable for whatever reasons, I still stand by my suggestion as quite reasonable.

      I don't see how ripping out brackets would help much, since there will probably be plenty of brackets that aren't delimiters both inside and outside of the delimited blocks.

      Thinking of how I'd catch all unmatched delimiters, I'd probably do this:

      my %start= qw( [( )]  [| |]  [{ }] );
      my %end= reverse %start;
      my $expect= "";   # Closing delimiter we expect, if any.
      my $code;
      for my $chunk (  split m#( \[ [(|{] | [)|}] \] )#x, $template  ) {
          if(  "" eq $expect  ) {
              if(  $start{$chunk}  ) {
                  $expect= $start{$chunk};
                  $code= "";
              } else {
                  warn "Unmatched $chunk\n"
                      if  $end{$chunk};
                  print $chunk;
              }
          } else {
              if(  $chunk eq $expect  ) {
                  print expand( $code );
                  $expect= "";
              } else {
                  $code .= $chunk;
                  warn "Found $chunk inside $end{$expect} $expect block\n"
                      if  $start{$chunk}  ||  $end{$chunk};
              }
          }
      }

              - tye (but my friends call me "Tye")

        ...and I just realized one minor way in which this is inferior to my original solution.

        This solution always interprets [|] as a starting delimiter, never as an ending delimiter. But given the unlikelihood of needing to embed code that ends in [ and the (moderate) difficulty in "fixing" this solution in this respect, I'd just document this as a limitation.
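
To make the limitation concrete, here is a small sketch that runs only the split step on two made-up templates; the variable names are illustrative.

```perl
use strict;
use warnings;

# Same split pattern as the loop: delimiters are kept as separate chunks.
my $pattern = qr#( \[ [(|{] | [)|}] \] )#x;

# A normal block: "[|" opens and "|]" closes.
my @ok  = split $pattern, "[|foo |] bar";

# Code ending in "[" right before the closer: the regex finds "[|" first,
# so the intended "|]" is never seen and only "]" is left over.
my @bad = split $pattern, "[|foo[|] bar";

print join( ",", map { "<$_>" } @ok ),  "\n";
print join( ",", map { "<$_>" } @bad ), "\n";
```

In the second case the loop would warn about an opening delimiter inside the block and never see the block close.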

                - tye (but my friends call me "Tye")