Monks,
I am trying my hand at parsing data with Parse::RecDescent for the very first time. I managed to cobble together a grammar without too much difficulty, and it successfully parses the example files I have fed to it. So far so good.
I am having some difficulty, however, figuring out how to handle comments. My language allows C-style comments (/* ... */) as well as C++ line comments (// ...) [1]. I need to process these comments in my parser rather than discard them, so setting $Parse::RecDescent::skip to a regex that skips over the comments is not a real solution (although I have done that for now, just to get going).
The only option I see is to define the rules for a comment [2]:

    comment: c_comment | line_comment

    c_comment: qr!
        /\*                 # C-Style comments open with a "/*"...
        (?:                 # followed by...
            [^*]            # non-"*" characters
            | \*(?=[^\/])   # or a "*" and a non-slash character
        )*
        \*/                 # ...and closed by a "*/"
    !x;

    line_comment: qr!//[^\n]*!;

... and then sprinkle this production liberally throughout the rest of the grammar:

    # Yuck!
    rule1: comment(s?) subrule1 | comment(s?) subrule2
    rule1: comment(s?) subrule3 comment(s?) subrule4
Can any monks out there lead me down the correct path? While I would expect the above to do what I want, it feels like a hack. Is there a smart way around this?
Thanks for your help.
Update 1: Fixed the c_comment regex as per Anonymous Monk's suggestion.
Update 2: The "big picture":
I am parsing configuration files, with a goal of amending small parts of them while leaving the majority of the file unchanged. So even though comments and whitespace are irrelevant to the semantics of the file, I need to keep track of them so they can be rendered when I print the amended version of the configuration file back out.
So if I were to change "sub-setting1" in the example below from "value1" to "value2", I would want to go from...

    setting = (
        sub-setting1 = "value1";
        optimize = "12"; // unreliable!
        /* foo = "bar"; Commented out, not working right now*/
    );

... to ...

    setting = (
        sub-setting1 = "value2";
        optimize = "12"; // unreliable!
        /* foo = "bar"; Commented out, not working right now*/
    );

The point being, I don't lose the formatting.
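To make the round-trip requirement concrete, here is a minimal sketch in plain Perl. It is a hand-rolled tokenizer, not Parse::RecDescent, and it is not meant as the answer to the grammar question. The helper names `tokenize` and `set_value` and the token types are hypothetical, and the lexical rules are assumptions based only on the sample above. The idea it illustrates: if every character of the input (comments and whitespace included) lands in some token, then joining the tokens reproduces the file exactly, and an edit touches only the token it targets.

```perl
use strict;
use warnings;

# Split the text into [type, text] tokens that together cover every
# character, so joining the token texts reproduces the input exactly.
# The token classes are assumptions made up for this sketch.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    pos($text) = 0;
    while (pos($text) < length($text)) {
        if    ($text =~ m{\G(/\*.*?\*/)}gcs) { push @tokens, [ comment => $1 ] }
        elsif ($text =~ m{\G(//[^\n]*)}gc)   { push @tokens, [ comment => $1 ] }
        elsif ($text =~ m{\G(\s+)}gc)        { push @tokens, [ space   => $1 ] }
        elsif ($text =~ m{\G("[^"]*")}gc)    { push @tokens, [ string  => $1 ] }
        elsif ($text =~ m{\G([\w-]+)}gc)     { push @tokens, [ name    => $1 ] }
        elsif ($text =~ m{\G(.)}gcs)         { push @tokens, [ punct   => $1 ] }
    }
    return @tokens;
}

# Replace the first string token that follows the given name token.
# Text inside comments is never a name token, so it is left alone.
sub set_value {
    my ($tokens, $name, $new_value) = @_;
    my $seen_name = 0;
    for my $tok (@$tokens) {
        my ($type, $text) = @$tok;
        if ($type eq 'name' && $text eq $name) { $seen_name = 1; next }
        if ($seen_name && $type eq 'string') {
            $tok->[1] = qq{"$new_value"};
            return 1;
        }
    }
    return 0;
}

my $config = <<'END';
setting = (
    sub-setting1 = "value1";
    optimize = "12"; // unreliable!
    /* foo = "bar"; Commented out, not working right now*/
);
END

my @tokens = tokenize($config);
set_value(\@tokens, 'sub-setting1', 'value2');
my $output = join '', map { $_->[1] } @tokens;
print $output;
```

The same principle would presumably carry over to a grammar-based parser: whatever is matched as a comment or whitespace has to be stored somewhere in the parse result rather than skipped, so that it can be printed back out.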
[1] Not sure these are the correct terms.
[2] While it is not the primary purpose of this question, if anybody spots an error in my comment regex, please let me know!
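One way to look for errors in the comment regexes is to exercise them outside the grammar. The following is a minimal sketch in plain Perl, assuming the two patterns behave the same there as they do as grammar terminals. The helper `leading_match` and the test strings are made up for illustration.

```perl
use strict;
use warnings;

# The two comment patterns from the grammar, as ordinary Perl regexes.
my $c_comment = qr!
    /\*                 # opens with "/*"...
    (?:                 # followed by...
        [^*]            # non-"*" characters
        | \*(?=[^\/])   # or a "*" and a non-slash character
    )*
    \*/                 # ...and closed by a "*/"
!x;
my $line_comment = qr!//[^\n]*!;

# Hypothetical helper: return what the pattern matches at the very
# start of the text, or undef if it does not match there.
sub leading_match {
    my ($re, $text) = @_;
    return $text =~ /\A($re)/ ? $1 : undef;
}

# Each case: pattern, input, and the text expected to be matched
# (undef means "should not match").
my @cases = (
    [ $c_comment,    "/* simple */",            "/* simple */"   ],
    [ $c_comment,    "/* a * b ** c */",        "/* a * b ** c */" ],
    [ $c_comment,    "/***/",                   "/***/"          ],
    [ $c_comment,    "/* one */ x /* two */",   "/* one */"      ],
    [ $c_comment,    "/* line 1\nline 2 */",    "/* line 1\nline 2 */" ],
    [ $c_comment,    "/* unterminated",         undef            ],
    [ $line_comment, "// unreliable!\nnext",    "// unreliable!" ],
);

for my $case (@cases) {
    my ($re, $input, $want) = @$case;
    my $got  = leading_match($re, $input);
    my $same = defined $want ? (defined $got && $got eq $want) : !defined $got;
    (my $shown = $input) =~ s/\n/\\n/g;    # keep the report on one line
    print(($same ? "ok   " : "FAIL ") . $shown . "\n");
}
```

The cases worth watching are runs of stars just before the closing slash and two comments on the same line, since those are the usual places where a C-comment pattern matches too little or too much.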
In reply to Handling Comments with Parse::RecDescent by crashtest