Help With Parsing and Commenting

EchoAngel has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Help With Parsing and Commenting by VSarkiss (Monsignor) on Jul 13, 2004 at 22:27 UTC
There are a few packages that might help, such as Parse::Yapp and Parse::RecDescent. To use either of these you'll need a reasonable understanding of parser theory, such as what LR(1) means. Depending on the situation, you may be able to get away with some simple regular expressions, but if you're dealing with a language of any complexity (such as having comment characters embedded in strings), regexes quickly run out of steam.
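To make the "comment characters embedded in strings" problem concrete, here is a minimal sketch in plain Perl, without either parser module. It assumes a made-up format in which `#` starts a comment and strings are double-quoted with backslash escapes; `strip_comment` is a hypothetical helper invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a trailing '#' comment from one line,
# but leave '#' alone when it sits inside a double-quoted string.
# Assumes the line has already been chomped.
sub strip_comment {
    my ($line) = @_;
    my $out = '';
    # Walk the line token by token: quoted string, comment, plain text,
    # or a single stray character (e.g. an unterminated quote).
    while ($line =~ /\G(?:("(?:[^"\\]|\\.)*")|(#.*)|([^"#]+)|(.))/gcs) {
        last if defined $2;    # a real comment: stop copying here
        $out .= defined $1 ? $1 : defined $3 ? $3 : $4;
    }
    $out =~ s/\s+\z//;         # drop whitespace left in front of the comment
    return $out;
}

my $line = 'name = "a#b" # real comment';

# The naive approach cuts at the first '#', even inside the string.
(my $naive = $line) =~ s/#.*//;
print "naive:     $naive\n";                       # naive:     name = "a
print "tokenized: ", strip_comment($line), "\n";   # tokenized: name = "a#b"
```

This still only copes with one kind of string and one kind of comment. Once the format has nesting or several quoting styles, a hand-rolled tokenizer like this gets unwieldy, which is where a real parser earns its keep.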
Re: Help With Parsing and Commenting by FoxtrotUniform (Prior) on Jul 13, 2004 at 23:18 UTC
"Hello Monks, I usually have to parse a lot of data, and usually the data would contain comments which should not be looked at, i.e. /* */ or #. After parsing the data, I should remake a file with most of the modified data (with the original comments). What ideas/functions can help me with this type of stuff?"

That's an awfully open-ended question. As VSarkiss suggested, parsing in general is a difficult problem (even the theory he cited solves only a simplified version of the general problem; context-dependent parsing, where things have different meanings depending on where they're used, is extremely difficult), so a general answer would take a lot of space (and time!).

That said, not all parsing is hard. For instance, if your data look sort of like

`key: value # comment`

your "parser" is going to be pretty trivial, something like

`# NOTE: untested!
while (<INPUT>) {
    my ($key, $val) = /^([^:]+):\s*([^\n#]+)/ or next;
    do_stuff_with($key, $val);
}`

If your data are a bit more complex, you may still be able to describe them with a regular expression. Without going into too much detail, regular expressions can match data whose structure doesn't depend on "nesting" or "counting", such as balanced parentheses. (Perl's "regexes" can handle some of those cases, but they're actually more powerful, theoretically speaking, than what geeks like me call regular expressions.) Anything more complex than that, and you'll want a real parser and a lot more theory.

So: what do your data look like?

Update: Oops, forgot to mention something. The converse problem to parsing (turning a text file into some sort of data structure) is "pretty-printing" (turning some sort of data structure into a text file). Pretty-printing isn't usually considered to be as difficult as parsing (the hard part about parsing is extracting structure; when you're pretty-printing something, you already know its structure), but you might run into problems replicating the comments: most parsers strip comments out of their input (since comments don't contribute to structure).
`-- F o x t r o t U n i f o r m Found a typo in this node? /msg me % man 3 strfry`
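For the simple `key: value # comment` case, the comment-replication problem can be sidestepped by keeping each comment attached to its record instead of throwing it away. What follows is only a sketch under assumptions: values never contain `#`, each comment lives on a single line, and the names `parse_line` and `format_line` plus the sample data are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical record format: one hash per input line, holding everything
# needed to write the line back out. Lines that are not "key: value"
# (blank lines, comment-only lines) are kept verbatim under 'raw'.
sub parse_line {
    my ($line) = @_;
    if ($line =~ /^([^:#\s][^:#]*?)\s*:\s*([^#]*?)\s*(#.*)?$/) {
        return { key => $1, value => $2, comment => $3 };
    }
    return { raw => $line };
}

# The "pretty-printer": turn a record back into a line of text,
# re-attaching the comment if the original line had one.
sub format_line {
    my ($rec) = @_;
    return $rec->{raw} if exists $rec->{raw};
    my $out = "$rec->{key}: $rec->{value}";
    $out .= " $rec->{comment}" if defined $rec->{comment};
    return $out;
}

# Made-up sample data standing in for a real input file.
my @input = (
    '# settings for the demo',
    'host: example.org   # primary server',
    'port: 8080',
    '',
);

my @records = map { parse_line($_) } @input;

# Modify the data; the comments ride along untouched.
for my $rec (@records) {
    $rec->{value} = 9090 if exists $rec->{key} && $rec->{key} eq 'port';
}

print join("\n", map { format_line($_) } @records), "\n";
```

Note that this normalizes the spacing between the value and its comment to a single space. If the output has to match the input byte for byte outside the modified values, the whitespace would need to be captured and stored in the record as well.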