Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

The (Larry) Wall Series

by Jeffrey Kegler (Hermit)
on Nov 08, 2007 at 23:19 UTC ( [id://649825]=perlmeditation: print w/replies, xml ) Need Help??

I'm testing a parser which parses ambiguous grammars (it'll eventually wind up in CPAN under the name Parse::Marpa). In searching for good test cases (criteria: small enough to debug, tricky enough to find bugs and if possible, fun), I've bumped into what I'll call the (Larry) Wall Series.

Consider the following grammar, which is both a restriction and an extension of Perl: The restriction is that only numbers and minus signs are allowed. The extensions are that precedence is ignored; that lvalues are automatically created as needed; and that ambiguity is allowed. Ambigious expressions are evaluated as a list of all the possible results. (Apparently something like this will be an option with Perl 6 regexes, which encourages me to hope my thinking is not too warped, or at least that I have company.)

So, for example, the expression "6----1" produces the list (7, 8, 6, 7) as follows:

((6--)-(-1))==7 (6-(--(-1)))==8 (6-(-(--1)))==6 (6-(-(-(-1))))==7
Putting a number at each end and the minus signs in the middle produces the greatest ambiguity, and is therefore the most interesting case. Given such an expression with n minus signs, what's the number of parses? Parse::Marpa produces this series: 1 1 3 4 8 12 21 33 55 88 144 232. You'll note this is similar to, but not quite, the Fibonacci series. It's a series known to mathematicians, but does not seem to have a name, so I'm taking the liberty of calling it the Wall Series. I've checked the first 5 by hand. For the curious here are the other parse results for the first 12 numbers in the Wall series.
Length 1, Parse 0, value==(6-1)==5 Length 2, Parse 0, value==(6-(-1))==7 Length 3, Parse 0, value==((6--)-1)==5 Length 3, Parse 1, value==(6-(--1))==6 Length 3, Parse 2, value==(6-(-(-1)))==5 Length 4, Parse 0, value==((6--)-(-1))==7 Length 4, Parse 1, value==(6-(--(-1)))==8 Length 4, Parse 2, value==(6-(-(--1)))==6 Length 4, Parse 3, value==(6-(-(-(-1))))==7 Length 5, Parse 0, value==(((6--)--)-1)==5 Length 5, Parse 1, value==((6--)-(--1))==6 Length 5, Parse 2, value==(6-(--(--1)))==7 Length 5, Parse 3, value==(6-(--(-(-1))))==6 Length 5, Parse 4, value==((6--)-(-(-1)))==5 Length 5, Parse 5, value==(6-(-(-(--1))))==6 Length 5, Parse 6, value==(6-(-(-(-(-1)))))==5 Length 5, Parse 7, value==(6-(-(--(-1))))==4 Length 6, Parse 0, value==(((6--)--)-(-1))==7 Length 6, Parse 1, value==((6--)-(--(-1)))==8 Length 6, Parse 2, value==((6--)-(-(--1)))==6 Length 6, Parse 3, value==((6--)-(-(-(-1))))==7 Length 6, Parse 4, value==(6-(--(--(-1))))==9 Length 6, Parse 5, value==(6-(--(-(--1))))==7 Length 6, Parse 6, value==(6-(--(-(-(-1)))))==8 Length 6, Parse 7, value==(6-(-(--(--1))))==5 Length 6, Parse 8, value==(6-(-(--(-(-1)))))==6 Length 6, Parse 9, value==(6-(-(-(--(-1)))))==8 Length 6, Parse 10, value==(6-(-(-(-(--1)))))==6 Length 6, Parse 11, value==(6-(-(-(-(-(-1))))))==7 Length 7, Parse 0, value==((((6--)--)--)-1)==5 Length 7, Parse 1, value==(((6--)--)-(--1))==6 Length 7, Parse 2, value==((6--)-(--(--1)))==7 Length 7, Parse 3, value==((6--)-(--(-(-1))))==6 Length 7, Parse 4, value==(((6--)--)-(-(-1)))==5 Length 7, Parse 5, value==(6-(--(--(--1))))==8 Length 7, Parse 6, value==(6-(--(--(-(-1)))))==7 Length 7, Parse 7, value==(6-(--(-(-(--1)))))==7 Length 7, Parse 8, value==(6-(--(-(-(-(-1))))))==6 Length 7, Parse 9, value==(6-(--(-(--(-1)))))==5 Length 7, Parse 10, value==((6--)-(-(-(--1))))==6 Length 7, Parse 11, value==((6--)-(-(-(-(-1)))))==5 Length 7, Parse 12, value==((6--)-(-(--(-1))))==4 Length 7, Parse 13, value==(6-(-(-(--(--1)))))==7 Length 7, Parse 14, value==(6-(-(-(--(-(-1))))))==6 Length 7, Parse 15, value==(6-(-(-(-(-(--1))))))==6 Length 7, Parse 16, value==(6-(-(-(-(-(-(-1)))))))==5 Length 7, Parse 17, value==(6-(-(-(-(--(-1))))))==4 Length 7, Parse 18, value==(6-(-(--(-(--1)))))==5 Length 7, Parse 19, value==(6-(-(--(-(-(-1))))))==4 Length 7, Parse 20, value==(6-(-(--(--(-1)))))==3 Length 8, Parse 0, value==((((6--)--)--)-(-1))==7 Length 8, Parse 1, value==(((6--)--)-(--(-1)))==8 Length 8, Parse 2, value==(((6--)--)-(-(--1)))==6 Length 8, Parse 3, value==(((6--)--)-(-(-(-1))))==7 Length 8, Parse 4, value==((6--)-(--(--(-1))))==9 Length 8, Parse 5, value==((6--)-(--(-(--1))))==7 Length 8, Parse 6, value==((6--)-(--(-(-(-1)))))==8 Length 8, Parse 7, value==((6--)-(-(--(--1))))==5 Length 8, Parse 8, value==((6--)-(-(--(-(-1)))))==6 Length 8, Parse 9, value==((6--)-(-(-(--(-1)))))==8 Length 8, Parse 10, value==((6--)-(-(-(-(--1)))))==6 Length 8, Parse 11, value==((6--)-(-(-(-(-(-1))))))==7 Length 8, Parse 12, value==(6-(--(--(--(-1)))))==10 Length 8, Parse 13, value==(6-(--(--(-(--1)))))==8 Length 8, Parse 14, value==(6-(--(--(-(-(-1))))))==9 Length 8, Parse 15, value==(6-(--(-(--(--1)))))==6 Length 8, Parse 16, value==(6-(--(-(--(-(-1))))))==7 Length 8, Parse 17, value==(6-(--(-(-(--(-1))))))==9 Length 8, Parse 18, value==(6-(--(-(-(-(--1))))))==7 Length 8, Parse 19, value==(6-(--(-(-(-(-(-1)))))))==8 Length 8, Parse 20, value==(6-(-(--(--(--1)))))==4 Length 9, Parse 0, value==(((((6--)--)--)--)-1)==5 Length 9, Parse 1, value==((((6--)--)--)-(--1))==6 Length 9, Parse 2, value==(((6--)--)-(--(--1)))==7 Length 9, Parse 3, value==(((6--)--)-(--(-(-1))))==6 Length 9, Parse 4, value==((((6--)--)--)-(-(-1)))==5 Length 9, Parse 5, value==((6--)-(--(--(--1))))==8 Length 9, Parse 6, value==((6--)-(--(--(-(-1)))))==7 Length 9, Parse 7, value==((6--)-(--(-(-(--1)))))==7 Length 9, Parse 8, value==((6--)-(--(-(-(-(-1))))))==6 Length 9, Parse 9, value==((6--)-(--(-(--(-1)))))==5 Length 9, Parse 10, value==(((6--)--)-(-(-(--1))))==6 Length 9, Parse 11, value==(((6--)--)-(-(-(-(-1)))))==5 Length 9, Parse 12, value==(((6--)--)-(-(--(-1))))==4 Length 9, Parse 13, value==(6-(--(--(--(--1)))))==9 Length 9, Parse 14, value==(6-(--(--(--(-(-1))))))==8 Length 9, Parse 15, value==(6-(--(--(-(-(--1))))))==8 Length 9, Parse 16, value==(6-(--(--(-(-(-(-1)))))))==7 Length 9, Parse 17, value==(6-(--(--(-(--(-1))))))==6 Length 9, Parse 18, value==(6-(--(-(-(--(--1))))))==8 Length 9, Parse 19, value==(6-(--(-(-(--(-(-1)))))))==7 Length 9, Parse 20, value==(6-(--(-(-(-(-(--1)))))))==7 Length 10, Parse 0, value==(((((6--)--)--)--)-(-1))==7 Length 10, Parse 1, value==((((6--)--)--)-(--(-1)))==8 Length 10, Parse 2, value==((((6--)--)--)-(-(--1)))==6 Length 10, Parse 3, value==((((6--)--)--)-(-(-(-1))))==7 Length 10, Parse 4, value==(((6--)--)-(--(--(-1))))==9 Length 10, Parse 5, value==(((6--)--)-(--(-(--1))))==7 Length 10, Parse 6, value==(((6--)--)-(--(-(-(-1)))))==8 Length 10, Parse 7, value==(((6--)--)-(-(--(--1))))==5 Length 10, Parse 8, value==(((6--)--)-(-(--(-(-1)))))==6 Length 10, Parse 9, value==(((6--)--)-(-(-(--(-1)))))==8 Length 10, Parse 10, value==(((6--)--)-(-(-(-(--1)))))==6 Length 10, Parse 11, value==(((6--)--)-(-(-(-(-(-1))))))==7 Length 10, Parse 12, value==((6--)-(--(--(--(-1)))))==10 Length 10, Parse 13, value==((6--)-(--(--(-(--1)))))==8 Length 10, Parse 14, value==((6--)-(--(--(-(-(-1))))))==9 Length 10, Parse 15, value==((6--)-(--(-(--(--1)))))==6 Length 10, Parse 16, value==((6--)-(--(-(--(-(-1))))))==7 Length 10, Parse 17, value==((6--)-(--(-(-(--(-1))))))==9 Length 10, Parse 18, value==((6--)-(--(-(-(-(--1))))))==7 Length 10, Parse 19, value==((6--)-(--(-(-(-(-(-1)))))))==8 Length 10, Parse 20, value==((6--)-(-(--(--(--1)))))==4 Length 11, Parse 0, value==((((((6--)--)--)--)--)-1)==5 Length 11, Parse 1, value==(((((6--)--)--)--)-(--1))==6 Length 11, Parse 2, value==((((6--)--)--)-(--(--1)))==7 Length 11, Parse 3, value==((((6--)--)--)-(--(-(-1))))==6 Length 11, Parse 4, value==(((((6--)--)--)--)-(-(-1)))==5 Length 11, Parse 5, value==(((6--)--)-(--(--(--1))))==8 Length 11, Parse 6, value==(((6--)--)-(--(--(-(-1)))))==7 Length 11, Parse 7, value==(((6--)--)-(--(-(-(--1)))))==7 Length 11, Parse 8, value==(((6--)--)-(--(-(-(-(-1))))))==6 Length 11, Parse 9, value==(((6--)--)-(--(-(--(-1)))))==5 Length 11, Parse 10, value==((((6--)--)--)-(-(-(--1))))==6 Length 11, Parse 11, value==((((6--)--)--)-(-(-(-(-1)))))==5 Length 11, Parse 12, value==((((6--)--)--)-(-(--(-1))))==4 Length 11, Parse 13, value==((6--)-(--(--(--(--1)))))==9 Length 11, Parse 14, value==((6--)-(--(--(--(-(-1))))))==8 Length 11, Parse 15, value==((6--)-(--(--(-(-(--1))))))==8 Length 11, Parse 16, value==((6--)-(--(--(-(-(-(-1)))))))==7 Length 11, Parse 17, value==((6--)-(--(--(-(--(-1))))))==6 Length 11, Parse 18, value==((6--)-(--(-(-(--(--1))))))==8 Length 11, Parse 19, value==((6--)-(--(-(-(--(-(-1)))))))==7 Length 11, Parse 20, value==((6--)-(--(-(-(-(-(--1)))))))==7 Length 12, Parse 0, value==((((((6--)--)--)--)--)-(-1))==7 Length 12, Parse 1, value==(((((6--)--)--)--)-(--(-1)))==8 Length 12, Parse 2, value==(((((6--)--)--)--)-(-(--1)))==6 Length 12, Parse 3, value==(((((6--)--)--)--)-(-(-(-1))))==7 Length 12, Parse 4, value==((((6--)--)--)-(--(--(-1))))==9 Length 12, Parse 5, value==((((6--)--)--)-(--(-(--1))))==7 Length 12, Parse 6, value==((((6--)--)--)-(--(-(-(-1)))))==8 Length 12, Parse 7, value==((((6--)--)--)-(-(--(--1))))==5 Length 12, Parse 8, value==((((6--)--)--)-(-(--(-(-1)))))==6 Length 12, Parse 9, value==((((6--)--)--)-(-(-(--(-1)))))==8 Length 12, Parse 10, value==((((6--)--)--)-(-(-(-(--1)))))==6 Length 12, Parse 11, value==((((6--)--)--)-(-(-(-(-(-1))))))==7 Length 12, Parse 12, value==(((6--)--)-(--(--(--(-1)))))==10 Length 12, Parse 13, value==(((6--)--)-(--(--(-(--1)))))==8 Length 12, Parse 14, value==(((6--)--)-(--(--(-(-(-1))))))==9 Length 12, Parse 15, value==(((6--)--)-(--(-(--(--1)))))==6 Length 12, Parse 16, value==(((6--)--)-(--(-(--(-(-1))))))==7 Length 12, Parse 17, value==(((6--)--)-(--(-(-(--(-1))))))==9 Length 12, Parse 18, value==(((6--)--)-(--(-(-(-(--1))))))==7 Length 12, Parse 19, value==(((6--)--)-(--(-(-(-(-(-1)))))))==8 Length 12, Parse 20, value==(((6--)--)-(-(--(--(--1)))))==4

Replies are listed 'Best First'.
Re: The (Larry) Wall Series
by Prof Vince (Friar) on Nov 09, 2007 at 08:11 UTC
    You'll note this is similar to, but not quite, the Fibonacci series. It's a series known to mathematicians, but does not seem to have a name, so I'm taking the liberty of calling it the Wall Series.
    Do not mistake numerology for maths. If we had to start giving a name to every single sequence, we'd run out of names very quickly.

    This sequence "looks like" Fibonacci's because it should be solution of a similar linear recurrence equation.
      The recurrence equation is a(n) = 2*a(n-2) + a(n-3) + 1. It's in the literature as A052952. I take "numerology" to suggest a pseudo-relationship based on conjecture from a small sample. That is, the first few numbers may match a series, but how do I know later numbers will actually match the series? I'll present a proof below that the parse counts generated from the Minus Sign Grammar are precisely A052952, and I hope that will take us out of the realm of numerology.

      First, a comment on why this is useful and/or interesting. Many monks will have heard of the Ackermann function. I don't know of any direct applications for it, but it gets a lot of use in testing because 1) it's an extreme example of a kind of computation that is practical; and 2) it focuses on a functionality that is useful.

      In investigating parsing of ambiguous grammars, I wanted an extreme example of the kind of grammar likely to be encountered in practice. "In practice" and "Perl" aren't quite synonyms, but they're close. My problem with Perl grammar was that it's (of course) unambiguous. So I eliminated all notions of precedence. Also, I did away with lvalue restrictions. That is, I legalized 6++, treating it as roughly equivalent to ($tmpNN=6)++.

      Honing in on the part of Perl most likely to generate lots of ambiguity, I focused on expressions with a number at each end and minus signs between them. Four operators are composed entirely of minus signs: pre-decrement, post-decrement, arithmetic negation and subtraction. The resulting grammar is

      E ::= Number | E-- | --E | -E | E-E

      In testing an ambigious parser, you want to be sure it gets all the parses. Counting by hand gets you only so far. Call the parse count for a string with n minus signs, W(n). At n=5, W(n)=8 and the counting is already quite difficult to do in your head:

      W(10)=144, W(20)=17711 and W(36)=39088169 so even for small n, we're in territory where hand counts just ain't gonna happen. Hence the importance of identifying the W(n) series with a mathematical series. I can simply count the parses generated by my algorithm and look up the appropriate number in the series to see if they match.

      Yes, you say, all very nice, but since you only hand checked up to 5, isn't this numerology? What reason is there to believe that the number of parses generated, W(n), actually matches A052952?

      A proof follows. Those who think they might have any taste for this sort of thing might want to at least attempt to follow it, since it's not what the mathematicians call "technical" and is presented in Perlish terms.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://649825]
Approved by moritz
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others perusing the Monastery: (5)
As of 2024-04-19 13:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found