in reply to Parse:RecDescent grammar help

Thank you both for the quick replies. Of course, now I have some more questions.

1.) Can you explain the following code? How does $lang->$method(@$item); know which subroutine to call?

2.) From looking at the output of Data::Dumper for the array and hash subroutines, it looks like this is stored in arrays or nested arrays. I'm having difficulty understanding how to extract this data. It's nice that Data::Dumper is smart enough to print the structure, but I'm not sure how it does it.

3.) Can this grammar be modified to allow the following entries?

$a = "hello"

$b = "world"

@array = $a, $b

%hash = key=>@array

etc...

Once again, thank you for your help!

-Phillip

Replies are listed 'Best First'.
Re^2: Parse:RecDescent grammar help
by tachyon (Chancellor) on Oct 26, 2004 at 06:42 UTC

    1) The first element of @item (ie $item[0]) is the PRD rule name. You will notice that in a lot of cases I am grabbing item 1+ and ignoring item 0 because we want the matched data, not the name of the rule. In the three 'method' rules we select 0,2,4/5 from @item which gives us the rule_name, var_name, assign_data. The rule name is the same as the method name. Get it? The rule we match also tells us what function to call to deal with the data.

    2) The data is stored as array refs or array refs of array refs. See perlreftut. I have given you an example of how to access a typical value. The parse tree is an array ref that holds more array refs, which probably hold yet more array refs. The first level of array refs is what we iterate over. We assign each one to $item, and this is the result of one successful rule parse. The @$ syntax gets us an array from our array ref. $ref->[0] gets us item 0, rather than the whole list.

    3) You can modify the grammar to your heart's content. The more complexity you add, the more problems you are going to find. You have what looks a hell of a lot like Perl5 syntax, and Perl is a bitch to parse. Why not just use a real language and let its parser generate a parse tree for you? I am not sure you have considered just how complex the project you propose is.
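To make points 1 and 2 concrete, here is a minimal sketch of calling a method by name in plain Perl, without Parse::RecDescent. The My::Lang class, its two methods, and the sample $tree are all made up for illustration; only the $lang->$method(@$item) call mirrors the code being asked about.

```perl
use strict;
use warnings;

# Hypothetical handler class: one method per grammar rule name.
package My::Lang;

sub new { return bless {}, shift }

sub variable {
    my ($self, $rule, $name, $data) = @_;
    return "variable $name";
}

sub array {
    my ($self, $rule, $name, $data) = @_;
    return "array $name";
}

package main;

my $lang = My::Lang->new;

# Made-up parse result: each entry is [ rule_name, var_name, assign_data ].
my $tree = [
    [ 'variable', 'a',    'hello' ],
    [ 'array',    'pets', [ 'dog', 'cat' ] ],
];

my @log;
for my $item (@$tree) {                 # @$tree: the array behind the ref
    my $method = $item->[0];            # item 0 is the rule name
    # Perl looks the method up by the string in $method, so this is
    # $lang->variable(...) on the first pass and $lang->array(...) on the second.
    push @log, $lang->$method(@$item);
}
print "$_\n" for @log;
```

Because the method is looked up by name at run time, adding a new rule to the grammar only needs a method with the same name in the handler class.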

    cheers

    tachyon

      Tachyon, thanks for the reply...I think I understand my questions now...I had no idea the rule name was passed back as $item[0].

      You're right about not considering how complex this project is...the generic language can be anything, not necessarily what I proposed. I'm not quite sure what you mean when you say "why not use a real language?" Do you have any examples?

      I was trying to come up with a generic language that allows the basic data structures (element, array, hash) that will eventually get parsed and translated to a tool specific language...this way variables only need to be specified in one place and converted if needed. If you have any suggestions, please don't hesitate to share them.

      Once again, I really appreciate all of your knowledge and help!!

      regards,

      Phillip

Re^2: Parse:RecDescent grammar help
by thekestrel (Friar) on Oct 26, 2004 at 10:28 UTC

    Hi Phillip,

    Firstly, as Tachyon says, these rules can get really funky really quickly, especially when you want to try and model nested things or conditional instructions (as I'm finding out with my tinkering).
    That aside, I've remodelled my rules (using my last example code) to accommodate the type of entries you wanted. First, here are some definitions.... in my little language these are my types...

    5 # Any number is a 'literal'
    "fluffy" # encased text I call an 'identifier'
    $cuddles # this is a 'variable'
    @animals # this is an 'array'
    %stuff # this is a hash


    Now if you follow the rules you'll see that you can pretty much have any combination of these...so you can do seksi things like this.. (put this in the text section from before as an example)

    %stuff = { animal => @pets, age => 5, name => "fluffy", colour => $col };


    (Just as a side note the $a = "hello" and $b = "world" should have already worked with my existing program, this bit is so you can embed 'variable's and 'array's in things)


    Replace all the bits in my 'rules' section from before with the following and that should spice things up...
    # --- Rules ---
    parse        : stmt(s?) EOF { $item[1] }
    stmt         : variable ';' { $item[1] }
                 | array ';' { $item[1] }
                 | hash ';' { $item[1] }
                 | <error>
    arrayelement : term ',' { [ @item[0, 1] ] }
                 | term { [ @item[0, 1] ] }
    arrayname    : ARRAY IDENTIFIER { [ 'array', $item[2] ] }
    array        : arrayname EQUAL arrayelement(s?) { [ @item[2, 4] ] }
    hashelement  : IDENTIFIER HASHASSIGN term ',' { [ @item[0,1,3] ] }
                 | IDENTIFIER HASHASSIGN term { [ @item[0,1,3] ] }
    hash         : HASH IDENTIFIER EQUAL '{' hashelement(s?) '}' { [ @item [0, 2, 5] ] }
    variablename : VAR IDENTIFIER { [ 'variable', $item[2] ] }
    variable     : variablename EQUAL term { [ @item[0, 2, 4] ] }
    term         : QUOTE IDENTIFIER QUOTE { [ 'identifier', $item[2] ] }
                 | LITERAL { [ 'literal', $item[1] ] }
                 | arrayname { $item[1] }
                 | variablename { $item[1] }


    ....and the output for the example I gave you...
    $VAR1 = [
      [
        'hash',
        'stuff',
        [
          [ 'hashelement', 'animal', [ 'array', 'pets' ] ],
          [ 'hashelement', 'age', [ 'literal', '5' ] ],
          [ 'hashelement', 'name', [ 'identifier', 'fluffy' ] ],
          [ 'hashelement', 'colour', [ 'variable', 'col' ] ]
        ]
      ]
    ];
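As a rough sketch of how to pull values back out of that structure, the snippet below copies the dump above into a literal and unpacks each level by list assignment. The variable names and the output format are my own choices, not part of the original program.

```perl
use strict;
use warnings;

# The parse result shown above, typed in by hand.
my $tree = [
    [ 'hash', 'stuff', [
        [ 'hashelement', 'animal', [ 'array',      'pets'   ] ],
        [ 'hashelement', 'age',    [ 'literal',    '5'      ] ],
        [ 'hashelement', 'name',   [ 'identifier', 'fluffy' ] ],
        [ 'hashelement', 'colour', [ 'variable',   'col'    ] ],
    ] ],
];

my @lines;
for my $stmt (@$tree) {
    # Each statement is [ rule, name, elements ].
    my ($rule, $name, $elements) = @$stmt;
    for my $element (@$elements) {
        # Each element is [ 'hashelement', key, term ]; skip the rule name.
        my (undef, $key, $term) = @$element;
        # Each term is [ type, value ].
        my ($type, $value) = @$term;
        push @lines, sprintf '%s.%s = %s (%s)', $name, $key, $value, $type;
    }
}
print "$_\n" for @lines;
```

Each level of nesting in the dump corresponds to one dereference (@$stmt, @$elements, @$element, @$term), which is usually the easiest way to read a Data::Dumper listing.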

    Have phun...
    Regards Paul
      Paul,

      Thank-you for the reply...I will try these changes out after Jury Duty :-(

      From the examples you and tachyon provided, I realize that the output of what's parsed can become very complex and get nasty real quick. Perhaps I'll need to set some limitations on how many nested statements are allowed...otherwise retrieving this data is going to be a nightmare.

      Thanks again!

      -Phillip


        Phillip,


        Limiting the rules.....Hmm, I'm not sure how you do that. Once you make it recursive you can just keep on embedding and it will parse to its heart's content.
        Which then kinda puts the onus on having a smart system to traverse said data. On the top level of the tree it would be a trivial task to search the upper nodes for ones of, say, type 'array', then search that list for the presence of the correct one.
        I'm kinda enjoying this actually, because now I'm a lot more equipped to play with some of my rules after tinkering with this.
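One way such a traversal could look, as a sketch only: a small recursive function that collects the name from every [ type, name ] pair of a given type, however deeply it is nested. The function name find_names and the pair-matching test are assumptions of mine, based on the tree shape in the earlier dump.

```perl
use strict;
use warnings;

# Recursively collect the names of every [ $type, $name ] pair in the tree.
# find_names is a hypothetical helper, not part of Parse::RecDescent.
sub find_names {
    my ($node, $type) = @_;
    return () unless ref($node) eq 'ARRAY';    # plain scalars hold no pairs

    my @found;
    my ($first, $second) = @$node;
    if (   @$node == 2
        && defined $first && !ref $first
        && defined $second && !ref $second
        && $first eq $type)
    {
        push @found, $second;
    }

    # Descend into every child; depth is limited only by the data.
    push @found, map { find_names($_, $type) } @$node;
    return @found;
}

my $tree = [
    [ 'hash', 'stuff', [
        [ 'hashelement', 'animal', [ 'array',      'pets'   ] ],
        [ 'hashelement', 'age',    [ 'literal',    '5'      ] ],
        [ 'hashelement', 'name',   [ 'identifier', 'fluffy' ] ],
        [ 'hashelement', 'colour', [ 'variable',   'col'    ] ],
    ] ],
];

my @arrays    = find_names($tree, 'array');
my @variables = find_names($tree, 'variable');
print "arrays: @arrays\n";
print "variables: @variables\n";
```

Since the function recurses on whatever it finds, nesting depth does not need to be limited in the grammar for the data to stay retrievable.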


        Regards Paul
      Paul, what version of perl are you using to produce this output? I am getting errors with the following input data.

      my $text = q {@dogs = ["dollar","mack"];
      %myHash = {animals => @dogs, age = 5, names=> "fluffy"}; };
      my $result = $parser->parse(\$text);

      OUTPUT
      ERROR (line -1): Invalid stmt: Was expecting ';' but found "["dollar","mack"];" instead
      Bad text.

      Thanks, -Phillip


        Phillip,
        This is because the rule format for Array I implemented is the following...

        @array = thing1, thing2, ....
        your example
        @dogs = [ "dollar", "mack" ];
        My method doesn't support the brackets yet. When you're building a language from scratch and it looks like Perl, it's hard not to assume it will handle these things automatically, but you have to tell it how to do each one...

        I removed the brackets and ran with the following... @dogs = "dollar", "mack";
        and it now runs but gives me gumby output see below:
        $VAR1 = [ [ '=', undef ] ];

        ok so mistake on my part =P replace the 'array' rule with this..
        array     : arrayname EQUAL arrayelement(s?) { [ @item[1,3] ] }
        now we add both lines back in...
        @dogs = "dollar", "mack";
        %myHash = {animals => @dogs, age = 5, names=> "fluffy"};

        gives the following output =)

        $VAR1 = [
          [
            [ 'array', 'dogs' ],
            [
              [ 'arrayelement', [ 'identifier', 'dollar' ] ],
              [ 'arrayelement', [ 'identifier', 'mack' ] ]
            ]
          ],
          [
            'hash',
            'myHash',
            [
              [ 'hashelement', 'animals', [ 'array', 'dogs' ] ],
              [ 'hashelement', 'age', [ 'literal', '5' ] ],
              [ 'hashelement', 'names', [ 'identifier', 'fluffy' ] ]
            ]
          ]
        ];

        A few things to note .... =)
        - The reference to '@dogs' in myHash is just stored as text that is flagged as type 'array' in the hash. It would be your job to search the tree to see if you had a valid array called 'dogs' in the tree to get data from; it doesn't automatically put the dogs array inside the hash or make any kewl little pointers or anything.
        - I mentioned in my last note that you could pretty much use any combination of literal, identifier, array or hash, but you can't actually have hashes embedded, i.e. hashes of hashes. This is an easy change...just see what I did to the 'array' and 'variable' types.
        - You can easily make it so that your array rule accommodates brackets too....think along the lines of adding something like the following (bracketed form first, since productions are tried in order and the plain form will happily match zero elements)..
        array : arrayname EQUAL '[' arrayelement(s?) ']' { [ @item[1,4] ] }
              | arrayname EQUAL arrayelement(s?) { [ @item[1,3] ] }
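For the first note, a sketch of what "searching the tree" might look like: one pass records every array definition by name, and a second pass builds ordinary Perl hashes, swapping each [ 'array', name ] reference for the stored values. The tree literal is the output shown above; %arrays, %hashes and the two-pass layout are my own assumptions about how you might structure it.

```perl
use strict;
use warnings;
use Data::Dumper;

# The parse result from above, typed in by hand.
my $tree = [
    [ [ 'array', 'dogs' ],
      [ [ 'arrayelement', [ 'identifier', 'dollar' ] ],
        [ 'arrayelement', [ 'identifier', 'mack'   ] ] ] ],
    [ 'hash', 'myHash',
      [ [ 'hashelement', 'animals', [ 'array',      'dogs'   ] ],
        [ 'hashelement', 'age',     [ 'literal',    '5'      ] ],
        [ 'hashelement', 'names',   [ 'identifier', 'fluffy' ] ] ] ],
];

# Pass 1: remember every array definition by name.
# An array statement looks like [ [ 'array', name ], [ elements ] ].
my %arrays;
for my $stmt (@$tree) {
    my ($head, $elements) = @$stmt;
    next unless ref($head) eq 'ARRAY' && $head->[0] eq 'array';
    # Each element is [ 'arrayelement', [ type, value ] ]; keep the value.
    $arrays{ $head->[1] } = [ map { $_->[1][1] } @$elements ];
}

# Pass 2: build real hashes, replacing array references with their values.
# 'variable' terms are left as bare names in this sketch.
my %hashes;
for my $stmt (@$tree) {
    my ($rule, $name, $elements) = @$stmt;
    next unless !ref($rule) && $rule eq 'hash';
    for my $element (@$elements) {
        my (undef, $key, $term) = @$element;
        my ($type, $value) = @$term;
        if ($type eq 'array') {
            die "Unknown array \@$value\n" unless exists $arrays{$value};
            $value = $arrays{$value};
        }
        $hashes{$name}{$key} = $value;
    }
}

print Dumper(\%hashes);
```

Doing the lookup after parsing, rather than inside the grammar actions, also means an array can be defined after the hash that uses it.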


        Regards Paul =)
        p.s. I'm using v5.8.4