in reply to Regex for removing Template::Toolkit comments?

I would do it in two steps: first split the template into literal parts and TT code, then strip out the TT comments, maybe:

use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Useqq = 1;

my $tt = <<'TT';
[% # this is a comment to the end of line
   foo = 'bar'
%]
<p>You might have an array in your TT</p>
[% foo = bar[5]; %]
<p>bw, bliako</p>
[%# placing the '#' immediately inside the directive
    tag comments out the entire directive
%]
TT

my @parts = ($tt =~ /\G(
     (?:[^\\\[]+)      # not a template, not a backslash
    |(?:[\\].)         # an escaped whatever
    |(?:[\[][^%])      # not a template, [ followed by whatever
    |(?:\[%.*?%\])     # within a TT template
    )
    /msgx);

@parts =
    map { s!\s+#.*$!!gm; $_ }      # comments up to EOL
    map { /^\[%#/ ? "" : $_ }      # TT comments
    @parts;

warn Dumper \@parts;

This will not deal well with templates containing code containing a literal %]. So, don't do that.
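
A minimal sketch of that limitation, assuming the same tokenizer regex wrapped in a sub (the name split_tt is made up for this example). A quoted string containing %] ends the directive early, and the rest is treated as literal text:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the tokenizer regex from the post above.
sub split_tt {
    my ($tt) = @_;
    return $tt =~ /\G(
         (?:[^\\\[]+)      # literal text
        |(?:[\\].)         # an escaped character
        |(?:[\[][^%])      # a bracket that does not open a directive
        |(?:\[%.*?%\])     # a directive, up to the FIRST closing tag
        )
        /msgx;
}

# The string literal inside the directive contains a closing tag.
my @parts = split_tt(q{[% foo = 'a %] b' %]});

print "<$_>\n" for @parts;
# <[% foo = 'a %]>
# < b' %]>
```

The non-greedy .*? cannot know that the first %] sits inside a quoted string, so the directive is cut in two.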

Re^2: Regex for removing Template::Toolkit comments?
by bliako (Abbot) on Aug 24, 2018 at 17:46 UTC

    thanks.

    Additional problem is the nested comments unless they are forbidden. Wouldn't I be better off converting the comment delimiters to C comment delimiters (/* */) and using a nested-comment regex from there?

      I don't understand what you mean by "Additional problem is the nested comments unless they are forbidden."? Can you show example input data where my regular expression fails?

      If you have a regular expression for nested C comment literals, I'm quite sure that it can be trivially converted to a regular expression matching nested TT comments by changing /* to [%# and */ to %]. But I really doubt that TT allows for nested comments anyway.
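
A rough sketch of what such a nested-comment regex could look like with TT delimiters, assuming Perl 5.10 or later for recursive subpatterns. The names $nested_tt_comment and body are made up for this example, and whether TT itself accepts nested comments is not settled here:

```perl
use strict;
use warnings;

# Matches "[%# ... %]" where the body may contain balanced "[% ... %]" pairs.
my $nested_tt_comment = qr{
    \[%\#                          # comment opener
    (?<body>
        (?:
            (?! \[% | %\] ) .      # any character that does not begin a tag delimiter
          | \[% (?&body) %\]       # a nested directive, matched recursively
        )*
    )
    %\]                            # closer that pairs with the opener
}xs;

my $text = "a [%# outer [% inner %] still comment %] b";
(my $stripped = $text) =~ s/$nested_tt_comment//g;

print "<$stripped>\n";   # <a  b>
```

This only balances the delimiters. A %] inside a quoted string would still confuse it.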

        It fails on the following, which is bad practice on my part: I sometimes comment out entire HTML sections for debugging, and such a section may itself include TT directives or TT comments, e.g.:

        [% # abc [%# xyz %] %]
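
A small sketch of why that input trips up a non-greedy tag pattern such as \[%.*?%\] (the variable names are made up for this example): the match ends at the inner %], and the outer closing tag is left over.

```perl
use strict;
use warnings;

my $input = '[% # abc [%# xyz %] %]';

# The non-greedy match stops at the first closing tag it finds.
my ($directive) = $input =~ /(\[%.*?%\])/s;
my $leftover    = substr $input, length $directive;

print "directive: <$directive>\n";   # <[% # abc [%# xyz %]>
print "leftover:  <$leftover>\n";    # < %]>
```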