in reply to Re: Perl not BNF-able??
in thread Perl not BNF-able??

True, but although such ambiguities become apparent to the programmer, they do not stop us from expressing whichever of the choices perl actually makes in BNF.

One world, one people

Replies are listed 'Best First'.
Re^3: Perl not BNF-able??
by spurperl (Priest) on Jul 01, 2005 at 10:09 UTC
    BNF is by definition for context free grammars. Perl's grammar is not context free.
Re: Perl not BNF-able??
by jonadab (Parson) on Jul 01, 2005 at 11:10 UTC
    such ambiguities ... do not stop us from expressing whichever of the choices perl actually makes in BNF.

    Yeah? Try it.

    The problem is that in order to resolve the ambiguity (i.e., in order to select the choice perl actually makes, as you put it) you have to have access to information about the surrounding context, information that gets lost when reducing the surrounding syntax to BNF.

    s ;; split / , / ;; s ,$, ; , ; print eval;

    Don't make me turn this into a self-modifying quine.

    Incidentally, a lot of Lisp or Scheme programmers recoil in horror when they find out that Perl's grammar is not context-free. "If it's not possible to determine what a given syntactic construct means independent of context, how could anyone ever keep track of what a piece of a program is doing?" Well, that's the problem BNF has. In practice, this is not generally a problem for programmers who know the language, but it gives overly simple parsers fits.

      The problem with 'Try it' is that it's an all or nothing proposition. For lisp, the BNF is small enough not to worry about that. I can see that you can't wrap up a piece of non-context-free syntax in isolation. However, what I expected was that it might be possible to describe different syntax-elements for each type of contextual variation.

      I had a bad feeling about eval from the start, but I will still need to see where the process breaks down when worked through from first principles before I understand exactly what will go wrong (assuming the context-free issue is insurmountable, that is).

      One world, one people

        A little warning. The yacc parser config in perl (perly.y) is 763 lines long, while the lexer (toke.c) is 10998 lines long. That should be a clue as to how much special-case code perl needs that the parser can't handle itself.

        Dave.

        For lisp, the BNF is small enough not to worry about that

        Well, yeah, lisp barely even *has* syntax. That's the opposite end of the spectrum.

        I can see that you can't wrap up a piece of non-context-free syntax in isolation. However, what I expected was that it might be possible to describe different syntax-elements for each type of contextual variation.

        Yes, in principle. Perl can be parsed, obviously, because perl does parse it, QED. But the question is, what is required to make that happen? Perl has both left-associative and right-associative operators, for instance (although Perl5 is less whole-hog in this regard than Perl6 will be). Perhaps the larger issue, though, is that there are various things that can change the rules for how to parse other things, and they don't all come right before the thing whose interpretation they change. Just for one little example, consider the /x modifier, which changes the interpretation of whitespace and number signs that occur in the middle of the regex, before it. For a Turing-complete parser, this is not a problem, because you just scan ahead and see what's coming. I'm not a BNF guru, but if it has a concept of lookahead, I'm not aware of it.

        Another problem is that the way BNF likes to handle things is to lump cases together and treat them as the same, but this presents serious problems when parsing Perl code, especially as regards the various quoting constructs, *especially* the quoting constructs related to pattern matching, because of the number of special cases. In the example above, a typical BNF approach would be to lump the allowed regular expression modifiers together (by defining a rule that can match any of them and then using that rule at given points in various larger rules), but as the example above illustrates, you can't do that, because a regular expression modifier can change the rules for parsing the regular expression that precedes it. It also can change the interpretation of what is parsed; /s and /m do that. BNF is not troubled by that wrinkle, since it is not concerned as much with semantics as it is with syntax -- but that is BNF's whole problem when it comes to Perl, because the syntax and the semantics are not fully separable: the parser needs (some) knowledge of the semantics in order to sort out the syntax.


        "In adjectives, with the addition of inflectional endings, a changeable long vowel (Qamets or Tsere) in an open, propretonic syllable will reduce to Vocal Shewa. This type of change occurs when the open, pretonic syllable of the masculine singular adjective becomes propretonic with the addition of inflectional endings."  — Pratico & Van Pelt, BBHG, p68

        Oh, one other thing...

        I had a bad feeling about eval from the start

        If you think eval is problematic, consider the /e substitution modifier, which may be combined with nonstandard delimiters...

        s ((foo))[{ some_possibly_lengthy_chunk_of_code_here(); somefunc($1) }]ex;