This document describes an alternative parsing strategy and compares it to the approach I considered earlier. For reference, here's the original 'algorithm':

--------------------------
ColdFusion::ParseTree
    used by    : ColdFusion::Parser, ColdFusion::TAGS::*
    created by : ColdFusion::Parser, ColdFusion::TAGS::*
    uses       : ColdFusion::ParseRules

Munches on a chunk of a ColdFusion-like template, producing a tag parse tree in the end. Here's its parse logic in point form:

1. Search for the first recognized tag (consulting the ColdFusion::ParseRules object).
2. If found, create a new tag object tasked to handle this tag (based on the tag<->object mapping supplied by the ColdFusion::ParseRules object).
3. Complete initializing the newly created tag object by searching for its end tag (in the template text) and setting the tag's body to whatever is enclosed within the pair of opening/closing tags.
4. Repeat until the entire template is parsed.

Note: the first instance of the ParseTree class (initialized inside the Parser class) will thus create the 1st level of the parse tree. The task of completing the entire tree rests on the nested tags, each of which will use ParseTree objects to transform its body into a tag tree.

Here's an example of what a 1st-level parse tree might look like:
---------------------------
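The four steps quoted above can be sketched roughly as follows. This is a Python illustration only, not the ColdFusion::ParseTree code: `KNOWN_TAGS` is a hypothetical stand-in for what ColdFusion::ParseRules would supply, `parse_level` and `find_close` are names made up for this sketch, and counting nesting depth to find the matching end tag is my assumption about how step 3 would have to work. Each recursive call plays the role of one more ParseTree instance.

```python
import re

# Hypothetical stand-in for ColdFusion::ParseRules: the recognized tags.
KNOWN_TAGS = {"cfoutput", "cfif"}
OPEN_RE = re.compile(r"<(%s)\b[^>]*>" % "|".join(sorted(KNOWN_TAGS)), re.IGNORECASE)


def find_close(text, name, start):
    """Return (body_end, after_close) for the end tag matching an open <name>."""
    tag_re = re.compile(r"<(/?)%s\b[^>]*>" % re.escape(name), re.IGNORECASE)
    depth = 1  # one <name> is already open when we start scanning
    for m in tag_re.finditer(text, start):
        depth += -1 if m.group(1) else 1
        if depth == 0:
            return m.start(), m.end()
    raise ValueError("no closing tag for <%s>" % name)


def parse_level(text):
    """Build one level of the tree; every tag body is parsed by a nested call."""
    nodes, pos = [], 0
    while True:
        m = OPEN_RE.search(text, pos)          # step 1: first recognized tag
        if m is None:
            break
        if m.start() > pos:
            nodes.append(text[pos:m.start()])  # plain text before the tag
        name = m.group(1).lower()
        body_end, after = find_close(text, name, m.end())  # step 3: find end tag
        # Step 2 and the "Note": the new node parses its own body, which
        # corresponds to creating another ParseTree instance per nested block.
        nodes.append((name, parse_level(text[m.end():body_end])))
        pos = after                            # step 4: repeat
    if pos < len(text):
        nodes.append(text[pos:])
    return nodes
```

For example, `parse_level("a<cfif x><cfif y>b</cfif></cfif>c")` yields one text node, one `cfif` node whose body was parsed by a second call (and whose inner `cfif` needed a third), and a trailing text node.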

This had obvious difficulties with deeply nested tags, as those would require creating multiple instances of the ParseTree class to parse the nested blocks of the source template individually.

The second approach is a different, 'associative' or 'chain', approach. The strategy consists of creating a new tag object upon the first encounter of the corresponding tag. Subsequent tags are then considered with respect to the tags found earlier. Thus, any nested tag is linked to its parent, which also establishes a nice 'control flow' chain. To put these words in perspective, consider this example:

Source template:
---------------------
<HTML>
<HEAD>
</HEAD>
<BODY>
Hello World!
Welcome, <cfoutput>#foo_name#</cfoutput>!
<cfif bool eq 1>
  <cfif foo = bar>
    <cfif bar = foo>
    </cfif>
  </cfif>
  BOOL is true!
<cfelse>
  BOOL is false!
</cfif>
</BODY>
</HTML>
---------------------

Here's how this template would be parsed without having to worry much about the 'nestedness' of the CFML code.

1. Look for any recognizable CFML text (variables/tags).

2. Found an opening tag: check the tag<->tag_class mappings and instantiate a new tag object of the class that is mapped to handle this tag. (Also mark the beginning of a new 'block', either by pushing this tag onto a special tag stack or by setting a variable, etc.)

(e.g. First tag to be found is <cfoutput>)

3. Found a variable (enclosed by a pair of # characters): create a variable object (CFML::VAR) to handle it. Also check for any tag that is the immediate owner of the current 'block' and associate this variable with that tag object. (This is the 'linking' part.)

(e.g. since the <cfoutput> tag is the immediate 'owner' of the block of CFML code that contains the #foo_name# variable, the newly instantiated CFML::VAR object will be associated with the cfoutput object)

4. Closing tag found... look up tag owner of this 'block' and complete its initialization.

(e.g. cfoutput tag object will be officially 'ready' to go)

5. Found an opening <cfif> tag. By checking the tag mappings, a Parser::TAGS::Control::If object is blessed into existence to handle the tag. The new tag object will also be the parent of any chunk of CFML following it.

6. Found another opening tag. Again, create a Parser::TAGS::Control::If object and associate it with the tag that currently owns this block (the tag object instantiated in step 5). This new tag now becomes the current 'block owner' for any CFML code to follow.

7. Found yet another (whew...) opening <cfif> tag. Go through the same procedure as in steps 5 and 6. This new tag is now the 'block owner'.

8. Found a closing tag! The currently open tag is 'finalized' and hands its 'block' ownership privileges back to its predecessor (the tag from step 6).

9. Found another closing tag... do the same. (block owner is tag found in step 5).

10. Found some random piece of 'text'. Associate it with the current block owner (the tag from step 5).

*11. Found an unhandled tag, 'cfelse'. Just associate it with the current block owner. (I think that Parser::TAGS::Control::If should handle the intricate details of its own behaviour. For example, upon receiving this <cfelse> tag it may know that its 'alternative' block is about to come and do whatever is required internally.)

12. Found another random piece of non-CFML text. Associate it with the current block owner (the tag from step 5).

13. Closing tag found. Complete tag created in step 5.

14. Whew... that's it! After this, we should have a tag tree that is ready to be 'executed'.

----
* marks points that I'm not quite 100% sure of yet.
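The walk-through above can be sketched with a single pass and a tag stack. Again, this is a Python illustration under assumptions, not the actual Perl module: the classes `Tag`, `If` and `Var`, the `TAG_CLASSES` mapping and the `parse` function are hypothetical stand-ins for Parser::TAGS::*, Parser::TAGS::Control::If, CFML::VAR and the tag<->tag_class mapping. The handling of `<cfelse>` (the step marked with *) is just one possible choice: the `If` object switches to its 'alternative' block when it is handed the tag.

```python
import re


class Var:
    """Stand-in for CFML::VAR."""
    def __init__(self, name):
        self.name = name


class Tag:
    """Generic tag object; also used for unhandled tags such as <cfelse>."""
    def __init__(self, name, attrs=""):
        self.name, self.attrs = name, attrs
        self.children, self.done = [], False

    def add(self, child):
        self.children.append(child)


class If(Tag):
    """Keeps a separate 'alternative' block that starts at <cfelse>."""
    def __init__(self, name, attrs=""):
        super().__init__(name, attrs)
        self.else_children = []
        self._target = self.children

    def add(self, child):
        if isinstance(child, Tag) and child.name == "cfelse":
            self._target = self.else_children  # later items go to the else block
        else:
            self._target.append(child)


# Hypothetical tag<->tag_class mapping; only mapped tags open a block.
TAG_CLASSES = {"cfoutput": Tag, "cfif": If}
TOKEN_RE = re.compile(r"<(/?)(cf\w+)([^>]*)>|#(\w+)#", re.IGNORECASE)


def parse(template):
    root = Tag("root")
    stack = [root]                      # stack[-1] is the current block owner
    pos = 0

    def add_text(chunk):
        if chunk.strip():               # whitespace-only text is dropped here
            stack[-1].add(chunk.strip())

    for m in TOKEN_RE.finditer(template):
        add_text(template[pos:m.start()])
        pos = m.end()
        closing, name, attrs, var = m.groups()
        if var is not None:             # step 3: variable linked to its owner
            stack[-1].add(Var(var))
            continue
        name = name.lower()
        if closing:                     # steps 4, 8, 9, 13: finalize the owner
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError("unexpected </%s>" % name)
            stack.pop().done = True
        elif name in TAG_CLASSES:       # steps 2, 5, 6, 7: new block owner
            node = TAG_CLASSES[name](name, attrs.strip())
            stack[-1].add(node)
            stack.append(node)
        else:                           # step 11: unhandled tag, owner decides
            stack[-1].add(Tag(name, attrs.strip()))
    add_text(template[pos:])
    if len(stack) > 1:
        raise ValueError("unclosed <%s>" % stack[-1].name)
    return root
```

Fed the source template from above, `parse` should return a root whose tag children are the `cfoutput` object (holding the `foo_name` variable) and the outer `cfif` object, with the two nested `cfif` objects and 'BOOL is true!' in its main block and 'BOOL is false!' in its alternative block. Nesting depth only grows the stack; no additional parser instances are needed.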

The second approach seems more straightforward to me. It also eliminates my earlier concerns about deeply nested code.



"There is no system but GNU, and Linux is one of its kernels." -- Confession of Faith

In reply to Alternative parsing algorithm. by vladb
in thread ColdFusion::Parser design by vladb
