simonodell has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone,

I have a problem... I'm not smart enough to write the code to solve a problem I have, and I'm hoping that by writing this essay, maybe I will be able to connect with someone who is smart enough, and who can see why I want to solve the problem in the first place.

So let me tell you a little story.

When I first started my journey into the world of Perl, I had no clue about well... er... anything really! My first scripts were very "C".

I carried on like that for a long time... working tangentially to the mainstream world of Perl. CPAN was unknown to me; DBIx::Class, Catalyst... these were not terms I was at all familiar with.

As a result, and prior to my enlightenment, I trod a certain path which took me to an unknown code place, a place I cannot find anything similar to at all.

I wrote my own templating syntax... but it's more than that, much more.

I fell in love with the syntax I produced, it was neat, tidy and oh so easy to construct complex systems with. But no one else knows about it, no one else can help with my problem.

My very best efforts at making an efficient system from it result in page render times of around 0.03 to 0.12 seconds on a 1.6 GHz Celeron processor... not fast enough!

But I cannot look away from what I have found... the simplicity... the elegance of it... this is no ordinary templating markup language, it's unique and special and I really want to bring it to the world....

But I am not smart enough to make it as efficient as it should be; my best efforts over several years have resulted in a regex-ing, eval-ing, slow, lumbering monster trying desperately to be a beautiful flower.

So I need help. I know I should just use Catalyst and DBIx::Class... I know I would be doing my career a huge favour by throwing in the towel and joining the herd.

But I can't.

Not because I'm not capable of writing standardised code, but because I cannot ignore what I found out there in the lonely wilderness of standards ignorance.

So here I am going to lay bare the syntax model I found, and by doing so I hope that someone out there will, just for the craic, have a look at it, and maybe glimpse something of what I see in it, and why I struggle onwards with it. I'm hoping that just maybe someone will see that I'm not just being stubborn, and that there is something about it that is worth playing with, worth exploring, worth coding with more insight and intelligence than I can bring to the problem... because it's been a long time now, and I have worked and worked at it alone, but I am running out of ideas, insight and IQ points.

Ok...

<say>hello world</say>
>hello world

(db mode="mask")
  <query>
    (use)queries/someSQLquery(/use)
  </query>
  <mask>
    <d>someColName</d><br>
    [link to="someURL"]<d>someOtherColName</d>[/link]
  </mask>
(/db)
So what is going on here? what is with the extra tag delimiter types... is this garbage?

I understand that many readers will already be looking elsewhere, disinterested, not seeing through the difficulty I have in expressing why it is that I am even talking about this... that's fine. If you're still here, let me explain further. Maybe you will read on and conclude that it was a waste of time, and if so I am sorry, but then maybe you will think: hold on... I see what this nutcase is on about, ahhhh... that's cool...

I know from experience of showing my working parser to others that when they see what it does they like it... I just don't know how to get across the dynamic nature of the thing in static text.

The syntax is made up of tags which can be used in a seemingly endless number of combinations which all arise from a very simple set of rules.

Rule 1 : tags are computed from the innermost towards the outermost.
<a><b><c>d</c></b></a>
is computed as c,b,a where the data "d" is handed to the expression c, which then computes and its result is handed to b, which then computes and hands its result to a.

let us say that d = 5 and that c means add 7 and that b means multiply by 2 and a means add 10
the resulting computation looks like this
<c>5</c> = 12
<b>12</b> = 24
<a>24</a> = 34
therefore the result of the expression
<a><b><c>d</c></b></a>
equals 34 when d equals 5

the meaning of the tags is defined by code in plugins corresponding to the tag names... so the code for the plugin "a" would look like this;

$result = $data+10;
where $data is the value contained in the tag when it is executed.
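
As a rough illustration of rule 1 (a sketch only, not the parser described in this post), the worked example can be reproduced in a few lines of Perl. The `%plugin` table and the `evaluate_angle` function are hypothetical names, and the sketch assumes plain tag names with no attributes. It repeatedly finds an innermost `<tag>...</tag>` pair, meaning one whose content contains no further `<`, and replaces it with the plugin's result.

```perl
use strict;
use warnings;

# Hypothetical plugin table: tag name => code that transforms the tag's data.
my %plugin = (
    a => sub { $_[0] + 10 },   # a means "add 10"
    b => sub { $_[0] * 2 },    # b means "multiply by 2"
    c => sub { $_[0] + 7 },    # c means "add 7"
);

# Evaluate <tag>...</tag> pairs from the innermost outwards.
sub evaluate_angle {
    my ($text) = @_;
    # An innermost tag is one whose content contains no further '<'.
    # Each substitution replaces one such tag with its plugin's result.
    1 while $text =~ s{<(\w+)>([^<]*)</\1>}{ $plugin{$1}->($2) }e;
    return $text;
}

print evaluate_angle('<a><b><c>5</c></b></a>'), "\n";   # prints 34
```

This still works on the text itself, so it shows the evaluation order rather than an efficient implementation.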

With me so far? I know at this stage this may seem very arbitrary, but please bear with me. This system is highly abstract; in fact mathematical abstraction is what this is all about.

Rule 2 : tags formed with ( ) brackets are computed prior to their child tags unless those child tags are also type ( )

(a)(b)(c)5(/c)(/b)(/a) is exactly the same as the above example.

however (a)(b)<c>5</c>(/b)(/a) is not.

in this case the compute order would be b,a,c

b runs first with "<c>5</c>" as its data. b means multiply by two, so since it's a string rather than a numerical value the result is <c>5</c><c>5</c>

this would then be handed to a, which means add ten... again since it's a string the result would come out as <c>5</c><c>5</c>10

the last thing to run is c... but this time because a prior tag added an extra <c> tag in when it multiplied it by 2, the c tag executes twice giving the end result :

121210

ok... I guess you're probably really scratching your head now... not about the syntax, but rather about why I am so keen on this. What do I see in this... it's silly, right?

Rule 3 : tags composed of type [ ] are computed last

[a][b][c]5[/c][/b][/a]
is the same as before.

however

<a>[b]<c>5</c>[/b]</a>
computes in the order c,a,b ... giving :

1 : <a>[b]<c>5</c>[/b]</a>
2 : <a>[b]12[/b]</a>
3 : [b]12[/b]10
result : 2410

That's all the rules....
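
The three rules together can be sketched as three passes over the text: ( ) tags first, then < > tags, then [ ] tags, each pass working from the innermost tag outwards. This is only an illustration under assumptions, not the real parser: tags have no attributes, `%plugin`, `@passes` and `evaluate` are made-up names, and the behaviour of c on a string (append "7") is a guess, since the post only defines the string behaviour of a and b.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical plugins: arithmetic on numbers, string operations otherwise.
my %plugin = (
    a => sub { my $d = shift; looks_like_number($d) ? $d + 10 : $d . '10' },
    b => sub { my $d = shift; looks_like_number($d) ? $d * 2  : $d x 2 },
    c => sub { my $d = shift; looks_like_number($d) ? $d + 7  : $d . '7' },  # string case assumed
);

# One pattern per bracket type, in the order the rules say they run.
# Each pattern matches a tag whose content holds no tag of the same type.
my @passes = (
    qr{\((\w+)\)([^()]*)\(/\1\)},       # rule 2: ( ) tags run first
    qr{<(\w+)>([^<]*)</\1>},            # rule 1: < > tags, innermost first
    qr{\[(\w+)\]([^\[\]]*)\[/\1\]},     # rule 3: [ ] tags run last
);

sub evaluate {
    my ($text) = @_;
    for my $re (@passes) {
        # Keep replacing innermost tags of this type until none remain.
        1 while $text =~ s{$re}{ $plugin{$1}->($2) }e;
    }
    return $text;
}

print evaluate('(a)(b)(c)5(/c)(/b)(/a)'), "\n";   # prints 34
print evaluate('(a)(b)<c>5</c>(/b)(/a)'), "\n";   # prints 121210
print evaluate('<a>[b]<c>5</c>[/b]</a>'), "\n";   # prints 2410
```

The three printed values match the worked examples for each rule above.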

but what good is that???

consider this;

(given this="hello")
  <hello>
    [use]somefile[/use]
  </hello>
  <else>
    [use]someOtherFile[/use]
  </else>
(/given)
we have two tags here, and some markup

(given) runs first and because of the code in the plugin for given, it selects the matching tag from its children and returns it as the result.... one of the two "use" tags.
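
A plugin like that could look roughly like the following. This is a guess at the behaviour described, not the actual plugin code: `given_plugin` is a hypothetical name, and it is assumed to receive the value of the this="..." attribute plus the raw text between (given ...) and (/given).

```perl
use strict;
use warnings;

# Hypothetical "given" plugin.
#   $this : value of the this="..." attribute
#   $body : raw text between (given ...) and (/given)
# Returns the content of the child tag named $this, or of <else> if
# there is no such child, or an empty string if neither exists.
sub given_plugin {
    my ($this, $body) = @_;
    for my $name ($this, 'else') {
        # \Q...\E quotes the name so it is matched literally.
        return $1 if $body =~ m{<\Q$name\E>\s*(.*?)\s*</\Q$name\E>}s;
    }
    return '';
}

my $body = '<hello> [use]somefile[/use] </hello> '
         . '<else> [use]someOtherFile[/use] </else>';

print given_plugin('hello', $body), "\n";   # prints [use]somefile[/use]
print given_plugin('bye',   $body), "\n";   # prints [use]someOtherFile[/use]
```

Because (given) is a ( ) tag it runs before its children, so the [use] tag it returns is only executed afterwards, and the branch that was not selected never runs at all.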

the upshot is with just these 3 rules, a complex hierarchy of nested tags can be built up which can define arbitrarily complex behavioural results.

(given this="(c)5(/c)")
  <12>
    the answer is 12!
  </12>
  <else>
    the answer was not 12, so therefore c was given a value other than 5
    [link to="someurl"]click[/link] here to try again!
  </else>
(/given)
I have developed around 50 plugins for my system... and it does work!

It's just not fast enough....

here is a forum system that is built entirely using these principles : http://67.23.4.149/action.pl

The theory is right... it's just that I don't know enough / am not smart enough to implement an efficient solution.

It's not my fault, I never finished college....

I know _FOR SURE_ that this system is wonderful to work with, all I need to do is make it efficient, and stop using a regexing / evalling old fashioned parsing system.

Please do not tell me to give up and use mainstream methods, I AM NOT GOING TO DO THAT.

Worst case scenario is that not a single one of you made it to the end of this document, and I just have to wait for the inevitable action of Moore's law for the code I have to become production viable... but then again, maybe I have just sparked someone's interest in this... I can only hope so.

It's not stubbornness... please don't assume that because I can't make it efficient that it cannot be efficient, or that because I am not able to do it alone that there is no value in this code tangent, or the syntax rules that I have discovered and outlined above.

writing sites using this syntax set is a real breeze, it is wonderful to work with.... I am 100% sure I am onto something important and special... so please...

can you help?

Re: Too difficult for me...
by Corion (Patriarch) on Jun 19, 2011 at 07:19 UTC

    From your description, it sounds to me as if you have an XML representation of the Abstract Syntax Tree to execute your code, but it's hard to tell. You posted a wall of text, but no code. I'm not sure what part of it relates to the problem, but you have this one sentence:

    It's just not fast enough....

    Without showing us some code, we can't help you there except with general points. Do less work - don't work on the textual representation of your program, but preparse it into a data structure. Cache expressions instead of re-evaluating them every time. Create static output as files instead of creating the output with every page load.
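
To make the "preparse it into a data structure" point concrete, here is a minimal sketch in Perl. It is built only from the three tag types in the question and assumes tags without attributes; `parse`, `tree_for` and the node layout (`tag`, `type`, `kids`) are invented for the illustration. The text is split into tags and plain text once, turned into a tree, and the tree is cached so later requests skip the parsing step.

```perl
use strict;
use warnings;

# Parse template text into a tree. Plain text becomes a string; a tag becomes
# { tag => name, type => '<' or '(' or '[', kids => [ ...children... ] }.
sub parse {
    my ($text) = @_;
    my $root  = { tag => 'ROOT', type => '', kids => [] };
    my @stack = ($root);
    # The capturing group makes split return text and tags alternately:
    # even indexes are plain text, odd indexes are tags.
    my @parts = split m{(</?\w+>|\(/?\w+\)|\[/?\w+\])}, $text;
    for my $i (0 .. $#parts) {
        my $part = $parts[$i];
        if ($i % 2 == 0) {
            push @{ $stack[-1]{kids} }, $part if length $part;
            next;
        }
        my ($type, $closing, $name) = $part =~ m{^(.)(/?)(\w+).$};
        if ($closing) {
            my $node = pop @stack;
            die "mismatched closing tag $part\n"
                unless $node->{tag} eq $name && $node->{type} eq $type;
        }
        else {
            my $node = { tag => $name, type => $type, kids => [] };
            push @{ $stack[-1]{kids} }, $node;
            push @stack, $node;
        }
    }
    die "unclosed tag $stack[-1]{tag}\n" if @stack > 1;
    return $root;
}

# Cache parsed trees by template text so each template is parsed only once.
my %tree_cache;
sub tree_for {
    my ($text) = @_;
    return $tree_cache{$text} //= parse($text);
}

my $tree = tree_for('<a>[b]<c>5</c>[/b]</a>');
print $tree->{kids}[0]{tag}, "\n";            # prints a
print $tree->{kids}[0]{kids}[0]{type}, "\n";  # prints [
```

Evaluating the tree according to the three rules is a separate step and is not shown here.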

      You wouldn't like my code....

      My code doesn't need fixing... it needs replacing! the parser I have written implements the syntax and provides a working platform, but it is pre-alpha at best.

      I just read the link you gave me and it seems similar, but the syntax I have is not quite the same as XML, as it has two extra types of tag delimiters ( ) and [ ]

      I'm really shooting in the dark as to how to explain what I have, so please bear with my ignorance (not stupidity, at least I hope not).

      perhaps if I gave you some more examples of the syntax itself you could understand what I am getting at... like I said the code I have which runs it ought to be deleted, and the hard drive it is stored upon melted down!

      anyway...

      Upon receiving a request, the existing system looks up a file in the appropriate folder called body.aXML (aXML is what I call it but I am aware that name is taken, another reason for a rethink)
      listing of an example body.aXML

      <html>
        <head>
          <title><conf>site_title</conf></title>
        </head>
        <body>
          <use>main</use>
        </body>
      </html>

      listing of main.aXML

      <div id="comments">
        <table>
          (sql mode="mask")
            <query>
              SELECT * FROM comments;
            </query>
            <mask>
              <tr>
                <td><d>comment</d></td>
                <td>[link action="showuser" userid="<d>userid</d>" ]<d>username</d>[/link]</td>
                <td><d>timestamp</d></td>
              </tr>
            </mask>
          (/sql)
        </table>
      </div>

      In most cases the body.aXML just contains a use tag which calls up a template, which then calls main.aXML, but for simplicity I didn't bother putting that detail into the example. The example gives an output of comments from the SQL database; the system itself takes care of connecting to the database etc., leaving the designer free to simply plug together the existing modules using the simple syntax. This massively streamlines the development of new systems, as the modules are reusable and can be stacked together in multiple different abstract ways.

      For another example, here is a version of main.aXML which takes a query data argument telling it how many results to show:

      <div id="comments">
        <table>
          (sql mode="mask")
            <query>
              SELECT * FROM comments LIMIT (qd)limit(/qd);
            </query>
            <mask>
              <tr>
                <td><d>comment</d></td>
                <td>[link action="showuser" userid="<d>userid</d>" ]<d>username</d>[/link]</td>
                <td><d>timestamp</d></td>
              </tr>
            </mask>
          (/sql)
        </table>
      </div>

      Where (qd)limit(/qd) would refer to a value passed in the query data key => value hash, under the key named "limit".
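
A lookup of that kind can be sketched as below. The names `%query_data`, `qd_plugin` and `qd_integer` are made up for the illustration and are not the real plugin; the digits-only check is an added assumption, included because the value ends up inside an SQL statement.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parsed query string, e.g. action.pl?limit=10
my %query_data = (limit => '10');

# Sketch of a "qd" plugin: the tag's data is the key to look up.
sub qd_plugin {
    my ($key) = @_;
    return $query_data{$key} // '';
}

# Assumed safety check: only let plain digits through into a LIMIT clause,
# and fall back to 0 for anything else.
sub qd_integer {
    my ($key) = @_;
    my $value = qd_plugin($key);
    return $value =~ /\A[0-9]+\z/ ? $value : 0;
}

print "SELECT * FROM comments LIMIT ", qd_integer('limit'), ";\n";
# prints SELECT * FROM comments LIMIT 10;
```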

        I can only recommend to you to write a compiler that takes your input language and outputs (for example) Perl:

        <div id="comments">
          <table>
            (sql mode="mask")
              <query>
                SELECT * FROM comments LIMIT (qd)limit(/qd);
              </query>
              <mask>
                <tr>
                  <td><d>comment</d></td>
                  <td>[link action="showuser" userid="<d>userid</d>" ]<d>username</d>[/link]</td>
                  <td><d>timestamp</d></td>
                </tr>
              </mask>
            (/sql)
          </table>
        </div>

        ... could become

        my $output;
        $output .= <<HTML;
        <div id="comments">
        <table>
        HTML
        my $limit = get_qd('limit'); # whatever "qd" is supposed to be
        my $results = fetch_sql( mode => "mask", query => <<SQL);
        SELECT * FROM comments LIMIT $limit;
        SQL
        for my $row (@$results) {
            $output .= <<HTML;
        <mask>
        <tr>
        <td>
        HTML
            $output .= $row->{comment};
            $output .= <<HTML;
        </td>
        <td>
        HTML
            # the heredoc interpolates the username as the link text
            $output .= link( action => 'showuser', userid => $row->{userid}, text => <<HTML );
        $row->{username}
        HTML
            $output .= <<HTML;
        </td>
        <td>
        HTML
            $output .= $row->{timestamp};
            $output .= <<HTML;
        </td>
        </tr>
        </mask>
        HTML
        }; # (/sql)
        $output .= <<HTML;
        </table>
        </div>
        HTML

        This is basically the same technique that Template uses. The remaining infrastructure of including parts of pages from other code reminds me of HTML::Mason. I think you could learn lots from looking at the respective implementations. The Everything Engine (which this site runs on) is also fairly similar, except that it doesn't try to encode database queries as HTML - it leaves plain Perl for that. Subroutine calls can be encoded as special tags, but I'm not sure that this is an overall good idea.

        Note that I'm no friend of large frameworks, because they usually work for nobody other than the author(s).

        With a more efficient rendering program, it would be possible to add in extra levels of abstraction to quickly and easily implement things such as a database abstraction layer. I could go into that in more detail but I don't want to overdo it before the simplicity of the ruleset and structure becomes apparent.
Re: Too difficult for me...
by roboticus (Chancellor) on Jun 19, 2011 at 12:31 UTC

    simonodell:

    After reading your node, I get the impression[1] that you've written your own language based on tags, and your program interprets the tree each time you want to generate a page. I'm thinking that the problem may be that you're reparsing your tree and interpreting the result for each page. Perhaps you should make a preprocessor that will take your tree and generate a program that directly generates the output, something like a compiler.

    If you do this, then you can use the standard profiling tools to find out what parts of the code are taking too long. This would enable you to ask smaller, more detailed questions about specific chunks of code, and would help you find out which parts of your compiler need the most work.

    Notes:

    [1] I found the post hard to read because it doesn't get to the point quickly enough, so I may have missed something important. If so, sorry.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Perhaps you should make a preprocessor that will take your tree and generate a program that directly generates the output, something like a compiler.
      Yes I agree that's how it should be done. If the original is newer than the "compiled" version, then re-parse/re-compile.
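
The "newer than" check can be done by comparing file modification times. A minimal sketch, assuming the template and its compiled form live in two separate files; `needs_recompile` is a hypothetical helper, not part of any existing code:

```perl
use strict;
use warnings;

# Return 1 when the compiled file is missing or older than its source,
# 0 when the compiled file is still up to date.
sub needs_recompile {
    my ($source, $compiled) = @_;
    return 1 unless -e $compiled;
    # Element 9 of stat's return list is the last-modified time (epoch seconds).
    my $source_mtime   = (stat($source))[9];
    my $compiled_mtime = (stat($compiled))[9];
    return $source_mtime > $compiled_mtime ? 1 : 0;
}
```

On each request the engine would call this once per template and only re-parse/re-compile when it returns 1, otherwise load the already compiled version.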
      I'm not trying to win anyone over, I'm trying to share what I have.

      I've spent a lot of time working at the abstract level provided by the syntax, and I have looked at other templating systems and I find their syntax to be very cumbersome, restrictive and clunky.

      The syntax rocks, my implementation sucks. So I have a choice, keep it all to myself and work out the problems the hard way over however long that takes, or let my baby go out into the world by itself and allow others to do their magic with it.

      I've already put in thousands of hours of my time. I'm not after kudos, congratulations, slaps on the back or anything like that. I just want to share it openly and receive back the benefit of letting other programmers who are wiser and smarter than I am work on the problem.

      If I can't convince anyone that what I have worked on for the last few years is of any worth at all, then I will just have to spend the next few years on it as well. I'm not being a stubborn jerk or anything; I just know what it's like to work within the paradigm, and nothing I find elsewhere comes close.

      It's these 3 simple rules, and then the programmer does whatever he/she wants with it... it doesn't have to have an extensive set of commands, or anything like that. I'm not trying to suggest I have an all-singing, all-dancing solution to everything!

      I'm just hoping that someone will grasp the syntax structure and play with it. I'm guessing at this point what I need to do is produce the video file I mentioned earlier and then let people directly download the source code I have already written. Maybe then, if they see what it's about and why it's so neat, an efficient compiler/parser can get written this side of 2016.
Re: Too difficult for me...
by ikegami (Patriarch) on Jul 11, 2011 at 16:29 UTC

    So what is going on here? what is with the extra tag delimiter types... is this garbage?

    Extra to what? Are you pretending this is somehow related to XML? It's not, so don't.

    and it does work!

    Actually, I'm pretty sure it'll crap out for

    (given this="...")
      <else>
        the answer is "else"!
      </else>
      <else>
        the answer was not "else"
      </else>
    (/given)

    As a side note, I hope the square brackets are handled by some other layer and have nothing to do with the engine you are presenting. If they are handled by your engine, you're needlessly limiting the content to be this UBB-ish markup, and there's absolutely no reason to do so.