Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re^2: Perl not BNF-able??

by anonymized user 468275 (Curate)
on Jul 01, 2005 at 11:45 UTC ( [id://471667] : note . print w/replies, xml ) Need Help??


in reply to Re: Perl not BNF-able??
in thread Perl not BNF-able??

The problem with 'Try it' is that it's an all or nothing proposition. For lisp, the BNF is small enough not to worry about that. I can see that you can't wrap up a piece of non-context-free syntax in isolation. However, what I expected was that it might be possible to describe different syntax-elements for each type of contextual variation.

I had a bad feeling about eval from the start, but I will still need to see where the process breaks down when worked through from first principles before I understand exactly what will go wrong (assuming the context-free issue is insurmountable, that is).

One world, one people

Replies are listed 'Best First'.
Re^3: Perl not BNF-able??
by dave_the_m (Monsignor) on Jul 01, 2005 at 12:30 UTC
    A little warning. The yacc parser config in perl (perly.y) is 763 lines long, while the lexer (toke.c) is 10998 lines long. That should be a clue as to how much special-case code perl needs that the parser can't handle itself.

    Dave.

      I was in a conversation the other day where Larry said that the reason he did that was to be able to say that Perl 5 had a smaller parser than did Perl 4. Now that he's refactoring the tokenizer, I think he regrets that choice somewhat.

Re: Perl not BNF-able??
by jonadab (Parson) on Jul 02, 2005 at 21:58 UTC
    For lisp, the BNF is small enough not to worry about that

    Well, yeah, lisp barely even *has* syntax. That's the opposite end of the spectrum.

    I can see that you can't wrap up a piece of non-context-free syntax in isolation. However, what I expected was that it might be possible to describe different syntax-elements for each type of contextual variation.

    Yes, in principle. Perl can be parsed, obviously, because perl does parse it, QED. But the question is, what is required to make that happen? Perl has both left-associative and right-associative operators, for instance (although Perl5 is less whole-hog in this regard than Perl6 will be). Perhaps the larger issue, though, is that there are various things that can change the rules for how to parse other things, and they don't all come right before the thing whose interpretation they change. Just for one little example, consider the /x modifier, which changes the interpretation of whitespace and number signs that occur in the middle of the regex, before it. For a Turing-complete parser, this is not a problem, because you just scan ahead and see what's coming. I'm not a BNF guru, but if it has a concept of lookahead, I'm not aware of it.

    Another problem is that the way BNF likes to handle things is to lump cases together and treat them as the same, but this presents serious problems when parsing Perl code, especially as regards the various quoting constructs, *especially* the quoting constructs related to pattern matching, because of the number of special cases. In the example above, a typical BNF approach would be to lump the allowed regular expression modifiers together (by defining a rule that can match any of them and then using that rule at given points in various larger rules), but as the example above illustrates, you can't do that, because a regular expression modifier can change the rules for parsing the regular expression that precedes it. It also can change the interpretation of what is parsed; /s and /m do that. BNF is not troubled by that wrinkle, since it is not concerned as much with semantics as it is with syntax -- but that is BNF's whole problem when it comes to Perl, because the syntax and the semantics are not fully separable: the parser needs (some) knowledge of the semantics in order to sort out the syntax.


    "In adjectives, with the addition of inflectional endings, a changeable long vowel (Qamets or Tsere) in an open, propretonic syllable will reduce to Vocal Shewa. This type of change occurs when the open, pretonic syllable of the masculine singular adjective becomes propretonic with the addition of inflectional endings."  — Pratico & Van Pelt, BBHG, p68
      Well, you can claim that Perl 6 is going whole hogger on context sensitivity than Perl 5, but the fact is that Perl 6 is cleaning up all those silly post-declarational switches you're carping about, and forcing them to be predeclarations (except for /x, which is going away entirely because it's mandatory). Left-to-right parsing is darn near mandatory in Perl 6. Lookahead dependencies are generally limited to one token. About the only exception I can think of offhand is deciding whether curlies are composing a hash or declaring a closure, and this is handled by thinking of both of them as closures, but then forcing an evaluation-time call on the closure if semantic analysis determines it's a hash composer.

      As for /e, that's also gone in favor of generalized expression interpolation.

        Well, you can claim that Perl 6 is going whole hogger on context sensitivity than Perl 5, but the fact is that Perl 6 is cleaning up all those silly post-declarational switches you're carping about

        I didn't mean to be carping. (I never considered BNF-reducability to be an interesting goal, much less an important one, for a language's development. I'm much more interested in whether Perl6 is powerful and expressive, and from what I've managed to absorb so far, it has those things in spades.) Nonetheless, now that you mention it, a number of the specific things I mentioned are indeed going away in Perl6. On the other hand, the grammar in Perl6 is mutable, so I rather doubt the Perl6 parser will be able to be 100% defined in terms of BNF, unless I'm missing something. (Which is possible. You know more about this subject than I do.)


        "In adjectives, with the addition of inflectional endings, a changeable long vowel (Qamets or Tsere) in an open, propretonic syllable will reduce to Vocal Shewa. This type of change occurs when the open, pretonic syllable of the masculine singular adjective becomes propretonic with the addition of inflectional endings."  — Pratico & Van Pelt, BBHG, p68
      This 471967 is the best explanation so far of why perl defies BNF although some other posts arguably provide better insight into the problems of parsing perl, which of course is also an important motivation for my OP.

      Yes I chose lisp in my comment as a contrary example because its BNF is about the simplest. Here is one version of it:

      http://cui.unige.ch/db-research/Enseignement/analyseinfo/LISP/BNFlisp.html

      One world, one people

        Just a comment on linking (tried this with /msg, but there's not enough characters there):

        Have you seen Link Shortcuts? I'm not sure, but the explicit URL you used may log some people out (depending on how they come into Perl Monks.).

        My understanding is the best way to do this is:

        [id://471967|Your Description Here]

        -QM
        --
        Quantum Mechanics: The dreams stuff is made of

Re: Perl not BNF-able??
by jonadab (Parson) on Jul 02, 2005 at 23:14 UTC

    Oh, one other thing...

    I had a bad feeling about eval from the start

    If you think eval is problematic, consider the /e substitution modifier, which may be combined with nonstandard delimiters...

    s ((foo))[{ some_possibly_lengthy_chunk_of_code_here(); somefunc($1) }]ex;