First, let us realise that your examples don't actually perform a ->read(\$s), but we'll just assume it's somewhere in there.

Secondly, let's be clear about the behaviour of read() and value(). read() returns the position in the input stream where scanning ended. It throws an error if the input couldn't be lexed, e.g. because no expected lexeme matched the input at the current location. It does not throw an error if the input has simply been exhausted. That is, it only throws if there was a problem with respect to the low-level grammar, which represents the lexer.
The value() method returns undef if there was no successful parse, or a reference to the result if the parse was successful. As you do not currently specify any semantics, the result itself is undef, so a successful parse gives you a reference to undef. In other words, value() indicates the success of the high-level grammar, which does the actual parsing.

Now let's look at the first example. Here, the input is accepted by the low-level grammar, as the token '12' can be matched. Therefore, read() succeeds. However, the high-level grammar also requires the '3' token, which the input doesn't contain. As the input has ended, the high-level grammar will fail, and value() will be undef, indicating a parse failure.
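As a rough sketch of that situation (your exact grammar isn't repeated here, so the rule `Top ::= '12' '3'` and the input `12` are my assumptions), this is how the two checks can be kept apart: guard read() with eval to catch lexer problems, then test whether value() is defined.

```perl
use strict;
use warnings;
use Marpa::R2;

# Assumed stand-in for the first example: the high-level rule
# needs the token '12' followed by the token '3'.
my $dsl = <<'END_OF_DSL';
:start ::= Top
Top ::= '12' '3'
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new({ source => \$dsl });
my $recce   = Marpa::R2::Scanless::R->new({ grammar => $grammar });

my $input = '12';    # deliberately missing the trailing '3'

# read() dies on a lexer-level problem, so guard it with eval.
my $pos = eval { $recce->read(\$input) };
die "lexing failed: $@" if not defined $pos;

# value() is undef when the high-level grammar has no complete parse.
my $value_ref = $recce->value();
print defined $value_ref
    ? "parse succeeded\n"
    : "read() stopped at position $pos, but there is no parse\n";
```

So a clean return from read() only tells you the lexer was happy; whether the whole input was a valid sentence is decided by checking value().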

Now the second example. Here, the low-level grammar provides the lexemes many_a ~ [A]+ and the anonymous 'A'. Due to longest-token matching, the input 'AA' will be completely consumed by the many_a lexeme. This token is accepted by the high-level grammar. Now the input is exhausted, so the token 'A' that the high-level grammar still expects will not be found, and the parse fails in the same way as in the first example.
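A minimal sketch of the second case, assuming the rule looks roughly like `Top ::= many_a 'A'` (the rule name and shape are my guess; only the many_a lexeme and the literal 'A' come from your post):

```perl
use strict;
use warnings;
use Marpa::R2;

# Assumed stand-in for the second example: many_a is a lexeme,
# and the rule wants one more literal 'A' after it.
my $dsl = <<'END_OF_DSL';
:start ::= Top
Top    ::= many_a 'A'
many_a ~ [A]+
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new({ source => \$dsl });
my $recce   = Marpa::R2::Scanless::R->new({ grammar => $grammar });

my $input = 'AA';

# The longest match at position 0 is many_a covering both characters,
# so nothing is left over for the literal 'A'.
my $pos = eval { $recce->read(\$input) };
die "lexing failed: $@" if not defined $pos;

my $value_ref = $recce->value();
print "read() ended at position $pos\n";
print defined $value_ref ? "parse succeeded\n" : "no parse\n";
```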

For efficiency, Marpa never backtracks like Perl regexes do. In particular, Marpa's low-level grammar is equivalent to regular expressions (the computer science thing, not Perl regexes), which can be recognized in linear time. That is, it is efficient as hell, but not very expressive. If you need more features, you must use the high-level grammar and resolve the ambiguity there, or switch to manual lexing (via pause events … but you'll come to that later).
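One way to move the decision into the high-level grammar could look like the sketch below. The names Top and a_tok are hypothetical; the idea is that the lexer only ever produces single-character tokens, and the repetition becomes a high-level sequence rule, so the parser can work out where many_a ends.

```perl
use strict;
use warnings;
use Marpa::R2;

# The repetition now lives in the high-level grammar (::=),
# while the lexer (~) only knows a single-character token.
my $dsl = <<'END_OF_DSL';
:start ::= Top
Top    ::= many_a a_tok
many_a ::= a_tok+
a_tok  ~ 'A'
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new({ source => \$dsl });
my $recce   = Marpa::R2::Scanless::R->new({ grammar => $grammar });

my $input = 'AA';
my $pos = eval { $recce->read(\$input) };
die "lexing failed: $@" if not defined $pos;

# With no semantics specified, a successful parse yields a
# reference to undef, so test the reference, not its target.
my $value_ref = $recce->value();
print defined $value_ref ? "parse succeeded\n" : "no parse\n";
```

The price is that the parser now sees one token per character, which is more work than a single many_a lexeme, but the grammar no longer depends on the lexer guessing the split.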

If you want a successful parse to have a more interesting value(), just add a :default ::= action => [name,values] to your grammar, which will create an array ref for each rule.
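For illustration, here is that statement in a small grammar of my own (the rule `Top ::= '12' '3'` is an assumption, not your code). Data::Dumper is only used to show the structure that comes back.

```perl
use strict;
use warnings;
use Marpa::R2;
use Data::Dumper;

# [name,values] makes every rule evaluate to an array ref holding
# the rule's name followed by the values of its children.
my $dsl = <<'END_OF_DSL';
:default ::= action => [name,values]
:start ::= Top
Top ::= '12' '3'
END_OF_DSL

my $grammar = Marpa::R2::Scanless::G->new({ source => \$dsl });
my $recce   = Marpa::R2::Scanless::R->new({ grammar => $grammar });

my $input = '123';
$recce->read(\$input);

my $value_ref = $recce->value();
die "no parse" if not defined $value_ref;

# value() returns a reference to the result, so dereference once.
my $tree = ${$value_ref};
print Dumper($tree);
```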


In reply to Re: Marpa -- Partial grammar match does not result in failure by amon
in thread Marpa -- Partial grammar match does not result in failure by tj_thompson
