in reply to Re: Re{4): Template Parsing - Finding tag pairs.
in thread Template Parsing - Finding tag pairs.

The problem with embedded languages (like PHP, CF, etc.) that use tag-like syntax is this: a proper parser for an embedded language should ignore all HTML markup (or any other markup, or any text that looks like markup). It should take into account only its own pseudo-tags. Is it possible to make HTML::Parser ignore everything except pseudo-tags? I don't think so, but I could be wrong.
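
To make "take into account only its own pseudo-tags" concrete, here is a minimal sketch in plain Perl, not anything HTML::Parser provides. The `cf` prefix, the sample input, and the `@tags` list are assumptions chosen for illustration. The scanner looks only for tags whose names start with the prefix and treats everything else, including malformed HTML, as opaque text.

```perl
use strict;
use warnings;

# Assumed sample input: a pseudo-tag nested inside a real tag,
# followed by ordinary markup mixed with more pseudo-tags.
my $text = '<p <cfif cond="x">>Hello</p><cfelse><b>bye</b></cfif>';

my @tags;

# Match only tags whose name starts with the (assumed) "cf" prefix.
# Any other markup is never inspected, so it cannot confuse the scan.
while ( $text =~ m{<(/?)(cf\w+)([^>]*)>}g ) {
    push @tags, ( $1 ? 'end' : 'start' ) . ":$2";
}

print "$_\n" for @tags;
```

This is deliberately naive: the `[^>]*` part stops at the first `>`, so an attribute value containing `>` would be cut short. A real template parser would need a proper tokenizer for its own tag syntax.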

--
Ilya Martynov (http://martynov.org/)

Re: Re{6): Template Parsing - Finding tag pairs.
by Juerd (Abbot) on Dec 26, 2001 at 00:45 UTC
    $whatever = 'CFML';
    Having normal HTML in $whatever or $whatever in normal HTML is not a problem with HTML::Parser, if you use the report_tags() method. That'll have the parser ignore unknown tags, leaving non-$whatever tags for what they are.

    So the three points you mention are irrelevant.
    Yes, it IS possible to make HTML::Parser ignore everything except pseudo-tags. That's what I've been talking about all the time - the report_tags() method. *sigh* :)
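
    A minimal sketch of the report_tags() approach, assuming the version 3 API of HTML::Parser. The `cfif` tag name, the sample input, and the `@seen` list are made up for illustration. With report_tags() set, one would expect only the listed tag to reach the start and end handlers, while `<p>` and `<b>` are skipped.

    ```perl
    use strict;
    use warnings;
    use HTML::Parser;

    my @seen;

    # Collect start and end events; 'tagname' passes the lowercased tag name.
    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [ sub { push @seen, "start:$_[0]" }, 'tagname' ],
        end_h       => [ sub { push @seen, "end:$_[0]" },   'tagname' ],
    );

    # Report start/end events only for the (hypothetical) pseudo-tag.
    $p->report_tags('cfif');

    $p->parse('<p>Hello <cfif cond="x"><b>world</b></cfif></p>');
    $p->eof;

    print "$_\n" for @seen;
    ```

    Note that this only filters which tags are reported. The input is still tokenized as HTML, so it works when the pseudo-tags sit in well-formed markup.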

    2;0 juerd@ouranos:~$ perl -e'undef christmas'
    Segmentation fault
    2;139 juerd@ouranos:~$

      HTML::Parser doesn't want to parse broken HTML, as in this example (a pseudo-tag inside a real tag).
      use strict;
      use warnings;

      my $data = <<DATA;
      <p <pseudotag>>
      DATA

      use HTML::Parser;

      my $p = new HTML::Parser();
      $p->handler('start', \&start_sub, 'text');
      $p->report_tags('pseudotag');
      $p->parse($data);
      $p->eof;

      sub start_sub {
          my $text = shift;
          print "$text\n";
      }

      --
      Ilya Martynov (http://martynov.org/)