I found a few problems when I read through your grammar quickly.

1) By default, anything matching /\s+/ between tokens is ignored. That includes newlines, yet one of your tokens is a newline. I don't think that's going to work. You have to use <skip>.

2) Are you using $::RD_AUTOACTION or <autotree>? You won't get much from the grammar if you don't use either of these, or actions ({ ... }) to return selected tokens at the end of every rule. See 410587 for an example which uses actions to return selected tokens.

3) It would probably be better if you defined program as file(s) /^\Z/ rather than just file(s).

4) You defined fourcc as "'" /.{4}/ "'", which is the same as /'\s*.{4}\s*'/ (with the default <skip>), because whitespace is skipped before each of the three terminals. I think you want fourcc: /'.{4}'/.
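
To make point 4 concrete, here is a minimal sketch in Python's re module (an assumption for illustration only; the grammar itself is Parse::RecDescent, and these two patterns behave the same way in Perl). It compares the one-regex equivalent of the three-terminal rule with the single-token form:

```python
import re

# One-regex equivalent of "'" /.{4}/ "'" under the default <skip>:
# optional whitespace is allowed on both sides of the four characters.
three_terminals = re.compile(r"'\s*.{4}\s*'")

# The single-token form suggested in point 4: exactly four characters.
single_token = re.compile(r"'.{4}'")

for sample in ["'abcd'", "' abcd '", "'a cd'"]:
    print(
        repr(sample),
        bool(three_terminals.fullmatch(sample)),
        bool(single_token.fullmatch(sample)),
    )
```

Only the padded sample, ' abcd ', is accepted by the first pattern and rejected by the second.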

5) const must be above id in expr, or else true and false will be considered ids instead of bool_consts.
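
The ordering issue in point 5 can be sketched outside PRD. The Python below is a hypothetical stand-in, not Parse::RecDescent: first_match and the two rule lists are made up for illustration, and the patterns are simplified versions of id and bool_const. It tries the alternatives in order and stops at the first one that matches, which is what makes the order matter:

```python
import re

# Simplified, hypothetical stand-ins for the grammar's rules.
ID = re.compile(r"[a-zA-Z](?:[a-zA-Z0-9_]*[a-zA-Z0-9])?")
BOOL_CONST = re.compile(r"true|false")

def first_match(text, rules):
    """Return the name of the first rule whose pattern matches all of text."""
    for name, pattern in rules:
        if pattern.fullmatch(text):
            return name
    return None

# Same rules, different order.
id_first = [("id", ID), ("bool_const", BOOL_CONST)]
const_first = [("bool_const", BOOL_CONST), ("id", ID)]

print(first_match("true", id_first))
print(first_match("true", const_first))
```

With id listed first, "true" is classified as an id and bool_const is never reached.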

6) type : id | 'code' | 'handle' | 'integer' | 'real' | 'boolean' | 'string'
can be simplified to
type : id

7) I don't know if PRD can handle your expr and binary_op. Fix:

expr      : binary_op(s?) term

binary_op : term ('=='|'!='|'>='|'<='|/[-+*\/><]/|'and'|'or')

term      : unary_op   # Must be above id, array_ref & func_ref
          | func_call  # Must be above id, array_ref & func_ref
          | const      # Must be above id, array_ref & func_ref
          | array_ref  # Must be above id
          | func_ref   # Must be above id
          | id
          | parens

unary_op  : ('+'|'-'|'not') term

8) Actually, unary_op is probably slightly more efficient when written as:

unary_op : '+' term | '-' term | 'not' term

9) All your binary operators have the same precedence. How to fix:

expr        : binary_op(s?) term    # Lowest precedence.
binary_op   : binary_op_2 /and|or/
...
binary_op_8 : binary_op_9 /[+-]/
binary_op_9 : term /[*\/]/          # Highest precedence.
term        : ...
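
The idea in point 9 (one rule per precedence level, each level built from the next-higher one) can be sketched as a small hand-written parser. This is Python rather than Parse::RecDescent, and the token pattern, the four levels in LEVELS, and the tuple-shaped tree are all assumptions made for illustration, not part of the original grammar:

```python
import re

# Two-character operators are listed before the one-character class.
TOKEN = re.compile(r"\s*(==|!=|>=|<=|[-+*/><()]|\d+|[a-zA-Z_]\w*)")

# One set of operators per precedence level, lowest precedence first.
LEVELS = [
    {"and", "or"},
    {"==", "!=", ">=", "<=", ">", "<"},
    {"+", "-"},
    {"*", "/"},
]

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_level(tokens, level):
    """Parse one precedence level; operands come from the next level up."""
    if level == len(LEVELS):
        return parse_term(tokens)
    left, tokens = parse_level(tokens, level + 1)
    while tokens and tokens[0] in LEVELS[level]:
        op = tokens[0]
        right, tokens = parse_level(tokens[1:], level + 1)
        left = (op, left, right)  # left-associative
    return left, tokens

def parse_term(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        tree, rest = parse_level(rest, 0)
        if not rest or rest[0] != ")":
            raise SyntaxError("missing ')'")
        return tree, rest[1:]
    if head == ")" or any(head in ops for ops in LEVELS):
        raise SyntaxError(f"unexpected token {head!r}")
    return head, rest

def parse(text):
    tree, rest = parse_level(tokenize(text), 0)
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree

print(parse("a + b * c"))
print(parse("a + 1 > b and c"))
```

Because the multiplication level is parsed from inside the addition level, "a + b * c" groups as a + (b * c) without any extra bookkeeping.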

10) What you called "const" are really literals. Literals are constant, but constants are not necessarily literals.

11) Your definition of id has a space in it, and it shouldn't. Also concerning the definition of id, you should use (?:...) instead of (...). The former is faster, and you don't need to capture. Result: /[a-zA-Z](?:[a-zA-Z0-9_]*[a-zA-Z0-9])?/
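
A quick way to check what the id pattern from point 11 accepts is to run it against a few samples. The sketch below uses Python's re module as a stand-in for Perl (the pattern means the same in both), and is_id is a hypothetical helper name:

```python
import re

# The id pattern from point 11, unchanged.
ID = re.compile(r"[a-zA-Z](?:[a-zA-Z0-9_]*[a-zA-Z0-9])?")

def is_id(text):
    """True if the whole of text is a valid id."""
    return ID.fullmatch(text) is not None

for sample in ["a", "foo_bar1", "foo_", "_foo", "1abc", "foo bar"]:
    print(repr(sample), is_id(sample))
```

The pattern requires a leading letter and forbids a trailing underscore, and because the group is non-capturing, a match carries no capture groups.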

12) Your definition of string_const is way too greedy. It'll match up to the last double-quote in the file. You didn't include the escape mechanism. Finally, you shouldn't allow newlines in it. Try string_const : /"(?:[^\\"\n]|\\[^\n])*"/
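
To see the difference point 12 describes, the sketch below compares a greedy pattern with the suggested one. It uses Python's re module as a stand-in for Perl; GREEDY is a hypothetical pattern of the kind described, not the exact text of the original grammar, and source is made-up sample input:

```python
import re

# Hypothetical greedy definition: "." also matches newlines here.
GREEDY = re.compile(r'".*"', re.DOTALL)

# The pattern suggested in point 12: no bare quote, backslash or newline
# inside the string, but a backslash may escape any non-newline character.
STRING_CONST = re.compile(r'"(?:[^\\"\n]|\\[^\n])*"')

# Two lines of made-up source, the second with escaped quotes.
source = 'a = "one"\nb = "two \\"quoted\\""\n'

print(repr(GREEDY.search(source).group()))
print(STRING_CONST.findall(source))
```

The greedy pattern returns a single match running from the first quote to the last one, while the suggested pattern finds the two string literals separately.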


In reply to Re: Syntax highlighting EBNF grammar language by ikegami
in thread Syntax highlighting EBNF grammar language by BUU
