ColdFusion-like parser for Perl

Hello, fellow Monks. Once again the time has come for me to seek your collective wisdom.

I'm currently busy with a project to develop a ColdFusion-like template parser for Perl. If properly implemented, such a package may prove quite helpful in a number of areas. In one particular case, I hope to use it to test certain (basic) ColdFusion pages on my home box, which only runs the latest version of the Apache web server with mod_perl, before uploading them to a remote box running a full-scale ColdFusion server. (Regrettably, I do not have telnet access to that box, as that requires extra $$ and a dedicated host; I'm with www.webhosting.com.)

Well, even this alone wouldn't have been enough of a compelling reason for me to concern myself with a project of this size. One other thing that drives me is how this package could then be extended to handle templates of various other formats (such as ColdFusion, ASP, etc.). The advantage of using a ColdFusion-like style, for me in particular, is that at my place of work we have all our design folks working with ColdFusion templates only. And since an ever-growing number of our scripts rely on one form of templating system or another, I figured it would be useful to have a ColdFusion-like templating package for Perl as well. This way, our design people won't have to learn some new and rather weird template syntax (such as the one offered by the current HTML::Template; while it's really great by me, not every designer will feel a compelling urge to learn it when he/she already knows a bunch of other templating languages, one of them being ColdFusion of course ;-> ).

So, when it comes to the actual design and implementation phase of this project, I think I'd appreciate some input/suggestions. As of now, I've jotted down an 'abstract' of some of the 'classes' that may be involved, along with related information.

I'm not sure how complex it should get, or rather how I could manage to keep it as simple as possible while still being extendable (since any simple thing evolves into a complicated beast eventually ;-).

#################################
# classes (packages) involved 
#################################

#
# main package 
# (at least that's where everything starts ;)
#
package Template::Parser;


###
# CLASS: TAG
###
#
# Implements an abstract base class for the real
# tag sub-classes (such as TAG::Output, etc.)
#
package Template::Parser::TAG;

###
# CLASS: TAG::Output
###
#
#
package Template::Parser::TAG::Output;

###
# CLASS: TAG::Control
###
#
# base class for various control tags.
#
# e.g. <cfif>,<cfswitch> etc.
package Template::Parser::TAG::Control;

###
# CLASS: TAG::Control::If
###
#
# encapsulates conditional tags (e.g. <cfif/> in ColdFusion)
#
package Template::Parser::TAG::Control::If;

###
# CLASS: TAG::Control::VAR
###
#
# base class for various variables.
#
package Template::Parse::VAR;

###
# CLASS: TAG::Control::VAR::Simple
###
#
# simple variable (err. pretty much any
# Perl non-object variable will do)
#
package Template::Parse::VAR::Simple;


###
# CLASS: TAG::Control::VAR::Query
###
#
# special query object
# (think of ColdFusion queries ;)
#
package Template::Parse::VAR::Query;


###
# CLASS: EXPR
###
#
# class handling complex expressions
#
# EXAMPLE:
#
# template =
#    <cfset bar = 2>
#    <cfset foo = bar * 2>
#
# will force Parser to create
# an EXPR object from "bar * 2".
# An EXPR object should then be
# able to parse the expression.
#
# There should be a method for an EXPR
# object to query a variable.
#
# I think this could be achieved
# by having the Parser object pass
# a reference to itself so that
# the newly created EXPR object
# can query for any variable value
# as needed (say, when its turn to
# process the expression has come)
#
# This may also allow for 'propagated'
# query whereby a value for the
# first occurrence of a variable
# is returned.. which in turn
# will create nice variable scoping.
#
package Template::Parse::EXPR;


#
# Other HTML text.
# note: might not be needed as my tag
# stack will hold 'scalar' refs for
# plain text that has nothing to parse in it.
package Template::Parse::TEXT;

1;


__END__

=head1 TECHNICAL NOTES

=head2 PARSING

Implemented by _parse()

Algorithm:

Sample template:

<html>
<body>
   Hello World, <cfoutput>#name#</cfoutput>!
  <cfif foobar eq 1> foobar! </cfif>
</body>
</html>

1. build tag stack
   1.1 split template by tag separators (default '<')
       E.g.: will get

       0 <html>
       1 <body> Hello World
       2 <cfoutput>#name#
       3 </cfoutput>!
       4 <cfif foobar eq 1> foobar!
       5 </cfif>
       6 </body>
       7 </html>


   1.2 For each recognizable tag, build a new object
       using tag/object mappings. For example like this
       (largely simplified version):

       %tag_mapping = (
            'cfoutput' => 'Template::Parser::TAG::Output',
       );

       Therefore, for the first <cfoutput> tag (at 2)
       a Template::Parser::TAG::Output object will be
       instantiated.

       Now, I'm wondering how to do this.....
       Here's one thought:

         1. the main Parser object is responsible for
            finding a pair of tags and instantiating
            a new TAG object by passing whatever belongs
            inside the pair of tags.  In that sense,
            a TAG::Output object will be created by passing
            this text to it (and nothing else):
            "<cfoutput>#name#</cfoutput>"

            The new tag object's task will then rest in
            dealing with parsing this little piece of text.
            For example, it may instantiate yet another
            Parser object passing it the string '#name#' (or whatever
            lies inside the pair of tags).  The parser will
            then notice that the text it has received contains
            a single variable and create a new VAR object.

            The tag object will direct the parser object
            to do one thing or another.  Also, certain tags
            may not contain other tags inside them (such as
            <cfoutput/>, which can't hold another <cfoutput>,
            whereas <cfif/> may be nested).  This could be enforced
            within the parser by checking the tag's 'nested' property,
            with the TAG::Output class instantiating its objects
            with the 'nested' attribute set to 0 by default.
            Remember that TAG::Output has Template::Parser::TAG
            as its parent and, therefore, the nested property
            may be defined in that base class as a way of
            ensuring that child objects contain this attribute.
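
To make steps 1.1 and 1.2 a bit more concrete, here is a rough sketch of how the splitting and the tag/class lookup might look. Nothing in it is settled: the helper names split_template() and classify_chunk(), the hash layout of each chunk, and the 'nested' values in the mapping are all assumptions made up for illustration. It is also naive, since a '<' inside an attribute value or a comment would confuse it.

```perl
use strict;
use warnings;

# Hypothetical tag/class mapping. The class names follow the packages
# sketched above; the 'nested' flags are assumed values for illustration.
my %tag_mapping = (
    cfoutput => { class => 'Template::Parser::TAG::Output',      nested => 0 },
    cfif     => { class => 'Template::Parser::TAG::Control::If', nested => 1 },
);

# Step 1.1: split in front of every '<'. The zero-width lookahead keeps
# the separator attached to the chunk that it starts.
sub split_template {
    my ($template) = @_;
    return grep { length } split /(?=<)/, $template;
}

# Step 1.2: classify a chunk as an opening tag, a closing tag or plain text.
# Tags that are not in the mapping (e.g. <html>) are treated as plain text.
sub classify_chunk {
    my ($chunk) = @_;
    if ($chunk =~ m{^<(/?)(\w+)}) {
        my ($closing, $name) = ($1, lc $2);
        if (my $info = $tag_mapping{$name}) {
            return {
                type => ($closing ? 'close' : 'open'),
                name => $name,
                %$info,
                text => $chunk,
            };
        }
    }
    return { type => 'text', text => $chunk };
}

my $template = 'Hello World, <cfoutput>#name#</cfoutput>!';
my @stack    = map { classify_chunk($_) } split_template($template);
printf "%-5s %s\n", $_->{type}, $_->{text} for @stack;
```

Pairing each 'open' chunk with its 'close' chunk, and refusing a same-kind 'open' in between when 'nested' is 0, would be the next step for the main Parser object.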


=head1 CONTROL FLOW

It's rather simple for a template.  With the exception
of various control structures such as loops, switches, etc.,
a template is parsed top-down.  The appropriate loop/control
tag objects will handle the rest.

As for the basic top-down approach, the main Parser object
should maintain the proper sequence/ordering of the tags that
it encounters during parsing.  Since examples work
best, let me bring up a few:

EXAMPLE 1:

Consider a template:
-------------------
<html>
<body>
   Hello World, <cfoutput>#name#</cfoutput>!

  <cfif foo eq 1>
     <cfif bar eq 1>
        foobar!
     <cfelse>
        foo...
     </cfif>
  </cfif>

</body>
</html>
-------------------

The Parent instance of Parser will have knowledge of the following
tags that are marked with '*' (note that the topmost Parser
object knows nothing of individual tag objects' contents)

-*- 1 Parse::Text --
<html>
<body>
   Hello World,

-*- 2 TAGS::Output --
'#name#'

-*- 3 TAGS::Control::If --
CONDITION:
 1  -- Parse::EXPR --
      'foo'
 2  'eq'
 3  -- Parse::VAR --
       1

BLOCK_TRUE:
   -- Parse::Control::If --
   CONDITION:
     1  -- Parse::EXPR --
           'bar'
     2  'eq'
     3  -- Parse::VAR --
           1

   BLOCK_TRUE:
      -- Parser --
         -- Parser::Text --
            'foobar!'

   BLOCK_FALSE:
      -- Parser --
         -- Parser::Text --
            'foo...'

BLOCK_FALSE: nil

-*- 4 Parse::Text --
</body>
</html>

-----------------------------
With such a structure in place (building it is a major piece
of work in the entire parsing process), it should be possible to
construct the final HTML document by starting at the first
tag and transferring 'control' all the way down to the bottom
tag object.
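
Here is a minimal sketch of that top-down pass, assuming the tree has already been built. To keep it short, the nodes are plain hashes rather than the TAG/VAR/EXPR objects described above, and the node layout (type, text, var, cond, true, false) is purely hypothetical. Only the 'eq' operator is handled, and it is treated as a plain string comparison.

```perl
use strict;
use warnings;

# Render a list of nodes top-down, handing 'control' from one node to the
# next. An 'if' node recurses into whichever block its condition selects.
sub render {
    my ($nodes, $vars) = @_;
    my $out = '';
    for my $node (@$nodes) {
        if ($node->{type} eq 'text') {
            $out .= $node->{text};
        }
        elsif ($node->{type} eq 'output') {
            my $value = $vars->{ $node->{var} };
            $out .= defined $value ? $value : '';
        }
        elsif ($node->{type} eq 'if') {
            my ($var, $op, $rhs) = @{ $node->{cond} };
            my $lhs  = $vars->{$var};
            # Assumption: only 'eq' is supported in this sketch.
            my $true = defined $lhs && $op eq 'eq' && $lhs eq $rhs;
            $out .= render($true ? $node->{true} : $node->{false}, $vars);
        }
    }
    return $out;
}

# Roughly the tree from EXAMPLE 1, without the surrounding HTML.
my $tree = [
    { type => 'text',   text => 'Hello World, ' },
    { type => 'output', var  => 'name' },
    { type => 'text',   text => '! ' },
    { type => 'if', cond => [ 'foo', 'eq', 1 ],
      true => [
          { type => 'if', cond => [ 'bar', 'eq', 1 ],
            true  => [ { type => 'text', text => 'foobar!' } ],
            false => [ { type => 'text', text => 'foo...' } ] },
      ],
      false => [] },
];

print render($tree, { name => 'Monk', foo => 1, bar => 0 }), "\n";
```

With real tag objects, each class would presumably carry its own render-like method, so the main Parser would only need to call it on each entry in order.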


=head2 VARIABLE SCOPES

Read the notes under Template::Parse::EXPR.
Referring to the previous tag stack diagram,
allowing various blocks of template 'code' to be
nested (through the use of nested tags, for example)
serves us well in that variable scoping may
now be easily achieved by doing successive queries
up the stack tree and returning the first value
thus found.
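
The 'query up the tree' idea could be sketched roughly as follows. The Scope package is a hypothetical helper and not one of the classes listed earlier; in the real design the same parent link might simply live in each Parser or TAG object.

```perl
use strict;
use warnings;

package Scope;   # hypothetical helper class, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless { vars => {}, parent => $args{parent} }, $class;
}

sub set {
    my ($self, $name, $value) = @_;
    $self->{vars}{$name} = $value;
    return $value;
}

# Walk up the chain of parents and return the first value found,
# or undef (in scalar context) if no scope knows the variable.
sub get {
    my ($self, $name) = @_;
    for (my $scope = $self; $scope; $scope = $scope->{parent}) {
        return $scope->{vars}{$name} if exists $scope->{vars}{$name};
    }
    return;
}

package main;

my $outer = Scope->new;
$outer->set(bar => 2);                        # <cfset bar = 2>

my $inner = Scope->new(parent => $outer);     # e.g. inside a nested tag
$inner->set(foo => $inner->get('bar') * 2);   # <cfset foo = bar * 2>

print $inner->get('foo'), "\n";
```

Setting a variable in the inner scope shadows the outer one without changing it, which is the scoping behaviour described above.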

...

"There is no system but GNU, and Linux is one of its kernels." -- Confession of Faith
