(OT) Yapp rules and resolving reduce/reduce errors

dragonchild has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: (OT) Yapp rules and resolving reduce/reduce errors by pc88mxer (Vicar) on Apr 10, 2008 at 19:33 UTC
I've run across problems like this when writing parsers. The way I would do it is to liberalize the grammer and leave the checking of the ident to a later processing stage. For instance, I would probably use something like: `ident : IDENT \| INDENT_QUOTED \| ident '.' IDENT \| ident '.' IDENT_QUOT +ED ;` [download] and not have the grammar distinguish between `table_ident`, `column_ident` and `database_ident`. The checking of the proper kind of ident can be done later. This approach also generally makes it easier to recover from errors. Update: The reduce-reduce conflict is this: when trying to parse a `column_ident` and the current input token is `ident` with look-ahead of `.`, the following reductions are possible: `database_ident -> ident # i.e. column_ident -> ident '.' ident '.' i +dent table_ident -> ident # i.e. column_ident -> ident '.' ident` [download] One solution would be to reverse the recursion: `database_ident: ident ; table_ident: ident \| ident '.' database_ident ; column_ident: ident \| ident '.' table_indent ;` [download] Then when the look-ahead is '.' the grammar knows exactly what production to try to complete.	[reply] [d/l] [select]
Re: (OT) Yapp rules and resolving reduce/reduce errors by mr_mischief (Monsignor) on Apr 10, 2008 at 19:46 UTC
Your problem, which I'm sure you've figured out, is that you have `column_ident` and `table_ident` reducing from the same token stream ( 'ident . ident' ). One way to deal with this is in a shift-reduce parser to break up your grammar into different nonterminals. Starting from the statement level down might help. Combining like nonterminals might, too. `table_and_column: ident '.' ident \| ident ;` [download] Then your statement can look something like: `statement: stmt_type from_into table_and_column set table_and_column where \| stmt_type from_into table_and_column ... ;` [download] (with suitable null options). There are open source and I think even public domain grammars for SQL. They should be available in both LL and LR. Perhaps you could start with one and translate it into YAPP instead of creating the grammar from scratch? One such SQL grammar, written in the EBNF notation of the Gold Parsing System, is apparently Ansi 89 compliant.	[reply] [d/l] [select]
Re: (OT) Yapp rules and resolving reduce/reduce errors by ikegami (Patriarch) on Apr 10, 2008 at 23:35 UTC
The problem is that `column_ident` can match `ident '.' ident` in two different ways: `column_ident table_ident '.' ident database_ident '.' ident ident '.' ident` [download] `column_ident table_ident '.' ident ident '.' ident` [download] pc88mxer's solution to defer identifier validation until later is sound, but here's one that does it during parsing: `database_ident: ident ; table_ident: ident '.' ident \| ident ; column_ident: ident '.' ident '.' ident \| ident '.' ident \| ident ;` [download]	[reply] [d/l] [select]
Re^2: (OT) Yapp rules and resolving reduce/reduce errors by dragonchild (Archbishop) on Apr 10, 2008 at 23:55 UTC
That's the solution I'm using, but it means that any change to how database_ident is defined needs to be propagated by hand. Isn't there a way of saying "In the case of a reduce/reduce conflict, resolve in this fashion." ? My criteria for good software: Does it work? Can someone else come in, make a change, and be reasonably certain no bugs were introduced?	[reply]
Re^3: (OT) Yapp rules and resolving reduce/reduce errors by ikegami (Patriarch) on Apr 11, 2008 at 00:08 UTC
No idea. I don't know anything about Yapp (or yacc, for that matter). If you really think `database_ident` is likely to change, you could go with: `database_ident: ident ; table_ident: ident ; column_ident: ident ; qualified_database_ident: database_ident ; qualified_table_ident: database_ident '.' table_ident \| table_ident ; qualified_column_ident: database_ident '.' table_ident '.' column_ident \| table_ident '.' column_ident \| column_ident ;` [download]	[reply] [d/l] [select]
Re: (OT) Yapp rules and resolving reduce/reduce errors by casiano (Pilgrim) on Apr 11, 2008 at 13:04 UTC
Dear Dragonchild, An alternative subject to your design constraints (no expanding) is to move the conflict from reduce-reduce to shift-reduce and solve it using priorities: `%start column_ident %left DUMMY %right '.' %% column_ident: table_ident '.' ident \| table_ident ; table_ident: ident '.' database_ident \| ident %prec DUMMY ; database_ident: ident ; ident: IDENT \| IDENT_QUOTED ; %%` [download] Now, when I compile with Parse::Eyapp I have no conflicts `pl@nereida:~/LEyapp/examples$ eyapp -v dc2.eyp pl@nereida:~/LEyapp/examples$` [download] I have written Parse::Eyapp that expands Parse::Yapp with capabilities similar to Parse::RecDescent. Consider using it. Hope it will help Best Casiano	[reply] [d/l] [select]
Re: (OT) Yapp rules and resolving reduce/reduce errors by casiano (Pilgrim) on Apr 12, 2008 at 09:39 UTC
Mmmm ... In the previous solution above I changed your left recursive rule for `table_ident` by a right recursive rule. If you don't like this, you can use this alternative solution: `pl@nereida:~/LEyapp/examples$ cat -n dc3.eyp 1 %start column_ident 2 3 %left DUMMY 4 %right '.' 5 6 %% 7 column_ident: 8 table_ident '.' ident 9 \| table_ident 10 ; 11 12 table_ident: 13 database_ident '.' ident 14 \| database_ident %prec DUMMY 15 ; 16 17 database_ident: 18 ident 19 ; 20 21 ident: 22 IDENT 23 \| IDENT_QUOTED 24 ; 25 %%` [download] This solution also avoided conflicts: The only shift-reduce conflict is solved in favor of "shift" (that I supppose is what you need here): `pl@nereida:~/LEyapp/examples$ eyapp -v dc3.eyp pl@nereida:~/LEyapp/examples$ head -15 dc3.output Conflicts: ---------- Conflict in state 1 between rule 4 and token '.' resolved as shift. Rules: ------ 0: $start -> column_ident $end 1: column_ident -> table_ident '.' ident 2: column_ident -> table_ident 3: table_ident -> database_ident '.' ident 4: table_ident -> database_ident 5: database_ident -> ident 6: ident -> IDENT 7: ident -> IDENT_QUOTED` [download] Best Casiano	[reply] [d/l] [select]