Don't trust the syntax highlighting in your text editor as an authority on how perl itself will treat a script.

Perl has been proven to be impossible to parse, if you define "parse" to mean "determine the structure of the code without executing it". (Clearly it is possible to determine the structure of Perl code if you actually execute it.) Text editors do their best, but sometimes fall short.

The editors that tend to do the best highlighting for Perl in my experience are Padre and SciTE. With SciTE, the only Perl syntax that seems to consistently confuse it is:

sub uppercase ($) { return uc $_[0]; }

(Yes, I'm well aware that this is a useless function. It's just an example.) SciTE will highlight the $) in the prototype as if it were the built-in $) variable, which holds the effective group ID (EGID).
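
To see how perl itself reads that line, here is a minimal sketch that asks perl for the sub's prototype with the core prototype() function. The printed labels and the $egid variable are my own additions for illustration.

```perl
use strict;
use warnings;

# The example sub from above: the ($) here is a prototype, not a variable.
sub uppercase ($) { return uc $_[0]; }

# prototype() is a core function that returns a sub's prototype as a string.
print "prototype: ", prototype(\&uppercase), "\n";
print "result:    ", uppercase('hello'), "\n";

# For contrast, this is the real $) variable (effective group ID list).
my $egid = $);
print "egid:      ", $egid, "\n";
```

perl reports the prototype as the string '$', so the highlighter's guess and perl's reading of the same two characters differ.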

Re^4: regex issue
by JavaFan (Canon) on Feb 17, 2012 at 10:54 UTC
    Perl has been proven to be impossible to parse.
    Yeah, that's why PPI is such a waste of time. Perlcritic and perltidy are just stabbing in the dark. And the Perl-to-JavaScript compiler(*) that was shown at FOSDEM was just a scam. ;-)
    Text editors do their best, but sometimes fall short.
    Text editors have a very different problem. They have to 1) be interactive (which means responding instantly), and 2) work on code in progress. Most of the time, not even perl can parse the text, because it isn't finished yet.

    (*) Yes, I know it isn't complete yet. Give Flavio some time.

      Perl is impossible to parse in the same sense that it's impossible to determine whether an arbitrary program halts.

      Given the following Perl script:

      #!/usr/bin/perl
      print("1\n");

      It should certainly be possible to parse it (without executing it), and it should also be possible to detect that it will halt.

      However, a parser cannot be written that will take any arbitrary valid Perl script as input and always produce a parse tree as output without executing the program.

      PPI can parse a very large subset of Perl scripts. It does so very well, but there will always be some scripts it simply can't decide. The canonical example is:

      whatever / 25 ; # / ; die "this dies!";

      This can be parsed in two very different ways, depending on the prototype of whatever. If it has a prototype of (), then it takes no arguments, so the line is interpreted as the following, plus a comment:

      whatever() / 25;

      If whatever has a prototype of ($), and so takes an argument, then it is interpreted as:

      whatever($_ =~ m{ 25 ; # }); die "this dies!";

      If the prototype of whatever is determined at runtime, e.g.:

      BEGIN {
          *sum = sub ($$) { (shift) + (shift) };
          *whatever = (sum(2,2) == 5) ? sub ($) {} : sub () {};
      }

      then the Perl code cannot be parsed without executing part of it. (The parser needs to call the sub sum.)
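
      The two readings can be observed directly. The following is only a sketch: try_prototype() and the DemoN package names are hypothetical helpers of mine, and it uses a plain named sub declaration inside a string eval rather than the glob assignment above, which I assume affects the parse in the same way.

      ```perl
      use strict;
      use warnings;

      # try_prototype() is a hypothetical helper: it compiles the ambiguous line
      # in a fresh package where whatever() has the given prototype, runs it,
      # and reports whether the code returned a value or died.
      my $counter = 0;
      sub try_prototype {
          my ($proto) = @_;
          my $pkg = 'Demo' . ++$counter;
          my $src = join "\n",
              "package $pkg;",
              "sub whatever ($proto) { 100 }",
              'whatever / 25 ; # / ; die "this dies!";',
              '';
          local $_ = '';    # defined target for the regex in the ($) reading
          my $value = eval $src;
          return defined $value ? "returned $value" : "died: $@";
      }

      print try_prototype(''),  "\n";   # prototype (): division, then a comment
      print try_prototype('$'), "\n";   # prototype ($): regex argument, then die
      ```

      With the empty prototype the line is a division and the eval returns a number. With ($) the same characters become a regex match followed by a live die statement.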

      Which is not to say that PPI and the other fine projects you mention are without value. Parsing a large subset of Perl is still very useful. Having a large subset of a fortune is better than having no money at all.

        Perl is impossible to parse in the same sense that it's impossible to determine whether an arbitrary program halts.
        Exactly. Which means that in most cases, it is possible (see PPI, see perlcritic, see perltidy, see the JavaScript compiler, see the syntax highlighters). The problem the OP was facing with his syntax highlighting had nothing to do with whether there may exist a program that cannot be fully parsed without executing some of it.