BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

Does anyone know of a publicly available example of one of the many CPAN parsing modules being used to parse a full-featured, block-structured language?

I'm looking for real world examples rather than toy, how-to demos.

Thanks.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re: Block-structured language parsing using a Perl module?
by Illuminatus (Curate) on Aug 14, 2012 at 18:08 UTC
    I can't really help you directly, but I would contact the author of Parse::Eyapp, if you haven't already. I know it's pretty new, but he must have had a reason to expand Yapp, and his changes look pretty extensive.

    fnord

Re: Block-structured language parsing using a Perl module?
by Anonymous Monk on Aug 15, 2012 at 03:05 UTC

      The problem with those is they are

      • either: hand-crafted parsers constructed to parse a specific language (HTML).

        These are no use because I'm looking for a parser constructor module.

      • or: examples of using the parser-constructor module to construct a parser for some more or less complicated language, written by the author of the module that does the construction.

        It is unsurprising that the author of a given module is motivated enough, and reasonably adept at using his own module, to persist in getting something moderately complicated to work.

        But can anyone else?

      If I could find an example of a parser module being used a) in a real-world project; b) of reasonable complexity; c) by someone other than its author, it would give some level of confidence that the module stands up to a) being learned; b) being debugged; c) being maintained in a timely fashion when bugs discovered through real-world usage are reported.

      The 3 modules I've experimented with:

      • had awful APIs -- large, complicated, verbose -- with lousy documentation, often as not couched in so much academic/theoretical terminology as to be almost unintelligible.

        I want to use a parser; not learn about the theory behind them.

      • gave almost useless error diagnostics when defining the grammar, and even worse diagnostics when given non-compliant source to parse.
      • were so ridiculously slow in operation that they are almost useless for real-world usage.
      • produced parse trees so complicated that you need to write another parser to process them.

      I can see I am going to end up writing my own; but given the richness of the modules on CPAN, I hoped that there was one amongst them that might stand up to real-world usage.


        :) I know this probably doesn't qualify either (and you probably saw it), but GraphViz2::Marpa is not by the Marpa author :) though it is also accompanied by a how-to article

        FWIW, the Marpa author does praise his own error diagnostics on his blog :)

        I want to use a parser; not learn about the theory behind them.

        Part of the problem with the lousy documentation is that you have to know enough theory to know both what type of parser you can use on a grammar and whether your grammar is even parsable. Because semi-structured text can vary so much in structure and meaning, the best any general-purpose grammar engine can do is push back on you a little and make you figure out whether your language is a regular language, whether you need lookahead and how much, and how you handle things like recursion, if at all.
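
        To make the recursion point concrete, here is a minimal sketch (toy grammar, made up purely for illustration) of the classic trap: a left-recursive rule that a recursive-descent module such as Parse::RecDescent cannot handle directly, and the repetition-based rewrite that it can:

            use strict;
            use warnings;
            use Parse::RecDescent;

            # Left-recursive form -- a recursive-descent parser loops forever on it:
            #   expr : expr '+' term | term
            # The same language, rewritten with repetition so Parse::RecDescent copes:
            my $grammar = q{
                expr : term ('+' term)(s?)
                term : /\d+/
            };

            my $parser = Parse::RecDescent->new($grammar)
                or die "Bad grammar\n";

            defined $parser->expr('1 + 2 + 3')
                or die "Parse failed\n";

        That rewrite is exactly the kind of theory the documentation tends to assume you already know.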

        Also a lot of the theoretical work comes from the world of linguistics, which is messy on its own.

        I agree about lousy APIs though.

        I can't speak about the performance of Regexp::Grammars, but if I were doing something like this, I'd start there for ease of use. I'd use Marpa for speed and completeness.
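
        For instance, here is a rough sketch of what a small block-structured grammar looks like in Regexp::Grammars (a toy grammar I just made up, purely to show the shape of the API; the resulting parse tree lands in %/):

            use strict;
            use warnings;
            use Regexp::Grammars;    # must be in scope when the qr// below is compiled

            # Toy language: nested { ... } blocks containing 'name;' statements.
            my $parser = qr{
                <block>

                <rule: block>       \{  <[statement]>*  \}
                <rule: statement>   <block> | <identifier> ;
                <token: identifier> [a-zA-Z_]\w*
            }xms;

            my $src = '{ foo; { bar; baz; } }';
            if ($src =~ $parser) {
                use Data::Dumper;
                print Dumper \%/;    # nested hashes/arrays mirroring the grammar
            }

        Whether that holds up on a real language at real-world sizes is, of course, exactly the question being asked.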

Re: Block-structured language parsing using a Perl module?
by tobyink (Canon) on Aug 17, 2012 at 07:54 UTC

      Thank you tobyink. Those are both fine examples of the information I was looking for.

      (It's a shame that the metacpan site sends my browser (Opera) off into la-la land, but that's not your problem :)

      For me, the most telling files are the "compiled" grammars: OwlFn & CSS.

      I realise these are generated files, but damn are they ever resource hungry. It is no wonder that P::RD is so slow. Dog-forbid that either of you authors ever has to go plugging around inside there in order to solve a problem.

      Have you ever had occasion to measure the performance of your parser? (Do OwlFn source files ever get big enough that it is a concern?)


Re: Block-structured language parsing using a Perl module?
by thargas (Deacon) on Aug 16, 2012 at 17:43 UTC

      Thanks for the link. I've pulled the PDF and will give it a read over the next few days.

      Though I do so with a great deal of skepticism. P::RD commits (or used to commit?) every one of my cardinal sins:

      1. horrible API;
      2. lousy documentation;
      3. useless diagnostics;
      4. glacial performance;

      Maybe Regexp::Grammars does better, but on a cursory inspection, I do not hold out much hope :(


        I don't know a lot about it. I did have to deal with it once, in a program which took commands in an SQL-like syntax; we had it pre-compile the grammar and save it instead of compiling it on each load, which did make a difference, IIRC. It was a while ago.
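
        For anyone curious about the mechanics, Parse::RecDescent's documented Precompile step does roughly that -- a sketch only, with the grammar file and class name made up for illustration:

            # One-off build step: bake the grammar into a standalone parser class.
            use strict;
            use warnings;
            use Parse::RecDescent;

            my $grammar = do {
                local $/;
                open my $fh, '<', 'commands.grammar' or die $!;
                <$fh>;
            };

            Parse::RecDescent->Precompile($grammar, 'SQLishParser');  # writes SQLishParser.pm

            # At run time, load the generated class instead of re-compiling the grammar:
            #   use SQLishParser;
            #   my $parser = SQLishParser->new();
            #   my $result = $parser->command($input);   # call whatever the grammar's top rule is named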