Re: Regex for removing Template::Toolkit comments?
by Corion (Patriarch) on Aug 24, 2018 at 17:04 UTC
I would do it in a two-step approach: first split the template into literal parts and TT code, and then strip out the TT comments, maybe:
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Useqq = 1;
my $tt = <<'TT';
[% # this is a comment to the end of line
foo = 'bar'
%]
<p>You might have an array in your TT</p>
[%
foo = bar[5];
%]
<p>bw, bliako</p>
[%# placing the '#' immediately inside the directive
tag comments out the entire directive
%]
TT
my @parts = ($tt =~ /\G(
    (?:[^\\\[]+)       # not a template, not a backslash
   |(?:[\\].)          # an escaped whatever
   |(?:[\[][^%])       # not a template, [ followed by whatever
   |(?:\[%.*?%\])      # within a TT template
)
/msgx);
@parts =
    map { s!\s+#.*$!!gm; $_ }    # comments up to EOL
    map { /^\[%#/ ? "" : $_ }    # TT comments
    @parts;
warn Dumper \@parts;
This will not deal well with templates containing code containing a literal %]. So, don't do that.
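To illustrate that caveat, here is a minimal sketch. The one-line template is made up for the demonstration; the pattern is the same non-greedy \[%.*?%\] alternative used in the tokenizer above. The match ends at the first %] it sees, even when that %] sits inside a quoted string:

```perl
use strict;
use warnings;

# Hypothetical one-directive template whose string literal contains "%]".
my $tt = q{[% msg = 'ends with %] here' %]};

# Same non-greedy directive pattern as in the tokenizer above.
my ($directive) = $tt =~ /(\[%.*?%\])/s;

# The directive is cut short inside the string literal.
print "$directive\n";    # prints: [% msg = 'ends with %]
```

The rest of the line (" here' %]") would then be treated as literal text, which is why the advice is simply not to write such templates.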
I don't understand what you mean by "Additional problem is the nested comments unless they are forbidden."? Can you show example input data where my regular expression fails?
If you have a regular expression for nested C comment literals, I'm quite sure that it can be trivially converted to a regular expression matching nested TT comments by changing /* to [%# and */ to %]. But I really doubt that TT allows for nested comments anyway.
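For what it's worth, a rough sketch of what such a converted pattern might look like, using a recursive named group (Perl 5.10 or later). The name $tt_comment and the sample string are made up for illustration, and this assumes that balanced nesting is wanted at all, which it may well not be if TT does not nest comments:

```perl
use strict;
use warnings;

# Matches a comment opener, a balanced body, and the closing tag.
# "inner" is a run of ordinary characters and/or balanced nested tags.
my $tt_comment = qr{
    \[%\# (?&inner) %\]
    (?(DEFINE)
        (?<inner>
            (?:
                (?: (?! \[% | %\] ) . )++    # chars that do not start a tag delimiter
              |
                \[% (?&inner) %\]            # a balanced nested tag pair
            )*
        )
    )
}xs;

# Hypothetical input: a comment directive containing a nested directive.
my $sample = "a [%# outer [% inside %] still comment %] b [% keep %]";
(my $stripped = $sample) =~ s/$tt_comment//g;

print "$stripped\n";    # prints: a  b [% keep %]
```

An unterminated comment simply fails to match and is left in place. Like the pattern above, this knows nothing about %] inside string literals.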
Re: Regex for removing Template::Toolkit comments?
by tybalt89 (Monsignor) on Aug 24, 2018 at 21:05 UTC
Here's a first try at a little recursive parser that will strip nested items.
If you have a better test case (or a counter-example) please let me know.
#!/usr/bin/perl
# https://perlmonks.org/?node_id=1221039
use strict;
use warnings;
$_ = <<END;
before
[% # this is a comment to the end of line
foo = 'bar'
%]
<p>bw, bliako</p>
[%# placing the '#' immediately inside the directive
tag comments out the entire directive
%]
[% outside %]
[%# placing the '#' immediately inside the directive
tag comments out the entire directive
[% inside %]
%]
after
END
print stripcomments();
sub stripcomments
{
    my $answer = '';
    $answer .=
        /\G\[\%#/gc ? stripcomments() x 0 :
        /\G\[\%/gc  ? '[%' . stripcomments() =~ s/#.*//gr . '%]' :
        /\G\%\]/gc  ? return $answer :
        /\G./gcs    ? $& :
        return $answer while 1;
}
Outputs:
before
[%
foo = 'bar'
%]
<p>bw, bliako</p>
[% outside %]
after
tybalt89, giveth with one hand and taketh away with the other...
I am struggling to convert it to a function with an input parameter... Getting there (in a biblical sense) ...
#!/usr/bin/perl
# https://perlmonks.org/?node_id=1221039
use strict;
use warnings;
my $someTTstring = <<END;
before
[% # this is a comment to the end of line
foo = 'bar'
%]
<p>bw, bliako</p>
[%# placing the '#' immediately inside the directive
tag comments out the entire directive
%]
[% outside %]
[%# placing the '#' immediately inside the directive
tag comments out the entire directive
[% inside %]
%]
after
END
print stripcomments($someTTstring);
sub stripcomments
{
    @_ and local $_ = shift;
    my $answer = '';
    $answer .=
        /\G\[\%#/gc ? stripcomments() x 0 :
        /\G\[\%/gc  ? '[%' . stripcomments() =~ s/#.*//gr . '%]' :
        /\G\%\]/gc  ? return $answer :
        /\G./gcs    ? $& :
        return $answer while 1;
}
Like this?
Re: Regex for removing Template::Toolkit comments?
by LanX (Saint) on Aug 24, 2018 at 17:33 UTC
I never used TT!
... but after browsing thru the docs, I wouldn't be surprised if there was a way to process a template to another template, and filter the content by hooking in.
HTH! :)
update
(from another thread) "you could subclass Template::Parser / Template::Directive"
"filter the content by hooking in"
Out of curiosity I looked into this a bit, and it turns out that hacking/hooking into Template::Parser (via Template::Directive, Template::Grammar, or even Parser.yp) is difficult, because it looks like Template::Parser::_parse drops the original source text and doesn't pass it into the handlers. But for a first step, all that's needed are the tokens, which can be provided by Template::Parser::split_text... but careful with the following, I haven't tested with a lot of different cases yet to see if there might be token types this doesn't handle.
#!/usr/bin/env perl
use warnings;
use strict;
use Data::Dump qw/dd pp/;
use Template::Parser;
my $text = <<'END';
before
[% # this is a comment to the end of line
foo = 'bar'
%]
<p>bw, bliako</p>
[%# placing the '#' immediately inside the directive
tag comments out the entire directive
%]
[% outside %]
after
END
my $parser = Template::Parser->new();
my $tokens = $parser->split_text($text);
#dd $tokens; # Debug
my $o = '';
for (my $i=0; $i<@$tokens; $i++) {
    if (ref $tokens->[$i]) {
        my $text = $tokens->[$i][0];
        #dd $text; # Debug
        $o .= "[% $text %]";
    }
    elsif ($tokens->[$i] eq 'TEXT') {
        my $text = $tokens->[++$i];
        #dd $text; # Debug
        $o .= $text;
    }
    else { die pp($i,$tokens->[$i]) }
}
print $o;
__END__
before
[% # this is a comment to the end of line
foo = 'bar' %]
<p>bw, bliako</p>
[% outside %]
after
some 2-second later edits below...
Thanks, your code's great. Doing what LanX proposed looked too scary to me (= switch to another task and read their manuals; feel free to downvote my human ingredients). And using regexes is too fragile without knowing the full spec of TT, i.e. whether nested comments are allowed and how to deal with [%, [%# and [% # inside strings, as Corion said. So letting TT do it seems the right way to me.
Edit2: To be fair to the regex solutions: it was me who asked for a regex in the first place.