in reply to Re^4: regex issue
in thread regex issue
Perl is impossible to parse in the same sense that it's impossible to determine whether an arbitrary program halts.
Given the following Perl script:
#!/usr/bin/perl
print("1\n");
It should certainly be possible to parse it (without executing it), and it should also be possible to detect that it will halt.
However, a parser cannot be written that will take any arbitrary valid Perl script as input and always produce a parse tree as output without executing any part of the program.
PPI can parse a very large subset of Perl scripts. It does so very well, but there will always be some scripts it simply can't decide. The canonical example is:
whatever / 25 ; # / ; die "this dies!";
This can be parsed in two very different ways depending on the prototype of whatever. If it has a prototype of () then it takes no arguments, so the / is a division operator and the line is interpreted as the following, plus a comment:
whatever() / 25;
If whatever has a prototype of ($), so it takes an argument, then the / starts a regex match and the line is interpreted as:
whatever($_ =~ m{ 25 ; # }); die "this dies!";
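To see both parses in action, here is a minimal sketch that compiles the very same line twice with string eval, once in a package where whatever has a () prototype and once where it has a ($) prototype. The package names Nullary and Unary, the return values, and the variable names are made up for illustration only.

```perl
use strict;
use warnings;

# The same source text is compiled twice below.
my $line = q{whatever / 25 ; # / ; die "this dies!";};

# Hypothetical packages: identical sub name, different prototypes.
{ package Nullary; sub whatever()  { 100 } }
{ package Unary;   sub whatever($) { 1 }   }

# With (), the "/" is division and everything after "#" is a comment.
my $quotient = eval "package Nullary; $line";
die "unexpected error: $@" if $@;
print "() prototype: result is $quotient\n";

# With ($), the "/" opens a regex matched against \$_, and the die runs.
local $_ = 'some text';
my $result = eval "package Unary; $line";
my $error  = $@;
print "(\$) prototype: died with: $error";
```

The first eval returns 100 / 25, while the second never returns a value because the die is now live code rather than part of a comment.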
If the prototype of whatever is determined at runtime, e.g.:
BEGIN {
    *sum      = sub ($$) { (shift) + (shift) };
    *whatever = (sum(2,2) == 5) ? sub ($) {} : sub () {};
}
then the Perl code cannot be parsed without executing part of it. (The parser needs to call the sub sum to find out which prototype whatever ends up with.)
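As a small sketch of that point, the built-in prototype function can report which prototype whatever received, but only once the BEGIN block has actually been run. The variable name $proto is just for illustration.

```perl
use strict;
use warnings;

BEGIN {
    # The prototype of whatever depends on the result of calling sum.
    *sum      = sub ($$) { (shift) + (shift) };
    *whatever = (sum(2, 2) == 5) ? sub ($) {} : sub () {};
}

# By this point perl has executed the BEGIN block, so the answer is known.
# A static parser that never runs code has no way to obtain it in general.
my $proto = prototype(\&whatever);
print "whatever has prototype ($proto)\n";
```

Since sum(2, 2) is 4, this picks the sub () {} branch, so prototype returns an empty string. Replace the condition with anything expensive or non-terminating and a purely static parser is stuck.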
Which is not to say that PPI and the other fine projects you mention are without value. Parsing a large subset of Perl is still very useful. Having a large subset of a fortune is better than having no money at all.
Re^6: regex issue
by JavaFan (Canon) on Feb 17, 2012 at 12:30 UTC