tj_thompson has asked for the wisdom of the Perl Monks concerning the following question:
Hello monks,
My ongoing Marpa adventures bring me back with some basic questions. I have traced it back to this test case:
use strict;
use warnings;
use Marpa::R2;
use Data::Dumper;

# Example 1
{
    my $g_str = <<'END_GRAMMAR';
:start ::= data
data ::= '12' '3'
END_GRAMMAR
    my $g_obj = Marpa::R2::Scanless::G->new({ source => \$g_str });
    my $p_obj = Marpa::R2::Scanless::R->new({
        grammar         => $g_obj,
        trace_values    => 1,
        trace_terminals => 1,
    });
    my $s = '12';
    $p_obj->read( \$s );
    print "EXAMPLE 1 VALUE:".Dumper($p_obj->value)."\n";
}

# Example 2
{
    my $g_str = <<'END_GRAMMAR';
:start ::= data
data ::= many_a 'A'
many_a ~ [A]+
END_GRAMMAR
    my $g_obj = Marpa::R2::Scanless::G->new({ source => \$g_str });
    my $p_obj = Marpa::R2::Scanless::R->new({
        grammar         => $g_obj,
        trace_values    => 1,
        trace_terminals => 1,
    });
    my $s = 'AA';
    $p_obj->read( \$s );
    print "EXAMPLE 2 VALUE:".Dumper($p_obj->value)."\n";
}
In example 1, given that the input string is only '12', but the grammar specifies two tokens ('12' and '3'), I would expect the $p_obj->read call to fail to parse. However, no failure occurs.
I do see the first lexeme being accepted in the trace output. But I do not see the second '3' lexeme being accepted or rejected. The $p_obj->value result is undefined, which would make sense if the read did not complete.
If a '3' is appended to the input string, the $p_obj->value call then shows the expected return value.
In example 2, the first lexeme that will accept multiple 'A' characters matches both, leaving none for the second lexeme requiring a single 'A' character. The end result here seems to be the same. The $p_obj->read call does not fail, but the value ends up as undef.
This seems to imply a greedy match on the first lexeme that will not allow the second lexeme to match. However, I would again expect a grammar failure during the read call.
So on to my questions.
As always, many thanks for the time and insight :)
EDIT: added in the read calls I left out from my copy and paste as Amon pointed out below.
EDIT2: As this question is very specific to Marpa grammar and behavior, and less about Perl, I've posted a variant of it, based on the feedback below, to the Google Marpa group here: https://groups.google.com/forum/#!topic/marpa-parser/fZzhxdBDbGk. I sincerely apologize if this is considered bad posting etiquette, but at this point I believe it's probably the more appropriate forum for the question. I will monitor both and ensure that what I learn there is posted here, in the hope that it will help others.
EDIT3: See the above link. It turns out that it is incorrect to assume the input string matched the grammar just because the read method did not throw an error.
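A minimal sketch of that point, using the grammar from example 1: treat the parse as successful only when value returns something defined, not merely when read returns without dying. The helper name parse_or_undef is hypothetical, and the `:default ::= action => ::array` line is an addition that is not in the original grammar; it is there only so that a successful parse yields an array reference.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical helper, not part of Marpa: returns the value reference for a
# complete parse, or undef if read() died or no parse was found.
sub parse_or_undef {
    my ( $grammar_src, $input ) = @_;
    my $grammar = Marpa::R2::Scanless::G->new( { source => \$grammar_src } );
    my $recce   = Marpa::R2::Scanless::R->new( { grammar => $grammar } );

    # read() may die on rejected input; running out of input early is
    # not treated as an error by read() itself.
    my $read_ok = eval { $recce->read( \$input ); 1 };
    return undef if not $read_ok;

    # value() returns a reference to the parse value, or undef when the
    # input read so far does not form a complete parse.
    my $value_ref = $recce->value;
    return $value_ref;
}

# Grammar from example 1, plus an assumed default action (see above).
my $grammar_src = <<'END_GRAMMAR';
:default ::= action => ::array
:start ::= data
data ::= '12' '3'
END_GRAMMAR

for my $input ( '12', '123' ) {
    my $value_ref = parse_or_undef( $grammar_src, $input );
    print "'$input': ", ( defined $value_ref ? 'parsed' : 'no parse' ), "\n";
}
```

The eval around read is a design choice rather than a requirement: it lets the caller handle "read died" and "read succeeded but there is no parse" in the same way.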
EDIT4: As for example 2 above, Jeffrey was able to offer some insight again. The lexer (rules defined with '~') is greedy by nature. The structural rules (defined with '::=') are much better at handling ambiguity. His solution was to move part of the logic into a structural rule. See the post here: https://groups.google.com/forum/#!topic/marpa-parser/6jgQj-MOLGM
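One way this could look for example 2 is sketched below. It is my own guess at the idea, not the grammar from the linked post: the lexeme a matches a single 'A', and the repetition moves into the structural rule many_a, so the lexer can no longer swallow the final 'A'. The symbol a, the helper parses, and the `:default` action line are all assumptions added for illustration.

```perl
use strict;
use warnings;
use Marpa::R2;

# Assumed rewrite of example 2: the lexeme 'a' is one character long and
# the "one or more" logic lives in the structural rule many_a.
my $grammar_src = <<'END_GRAMMAR';
:default ::= action => ::array
:start ::= data
data ::= many_a a
many_a ::= a+
a ~ 'A'
END_GRAMMAR

# Hypothetical helper: true if $input is a complete parse of the grammar.
sub parses {
    my ($input) = @_;
    my $grammar = Marpa::R2::Scanless::G->new( { source => \$grammar_src } );
    my $recce   = Marpa::R2::Scanless::R->new( { grammar => $grammar } );

    # Treat a dying read() the same as "no parse".
    my $read_ok = eval { $recce->read( \$input ); 1 };
    return 0 if not $read_ok;

    # A defined value reference means a complete parse was found.
    my $value_ref = $recce->value;
    return defined $value_ref ? 1 : 0;
}

for my $input ( 'A', 'AA', 'AAA' ) {
    print "'$input': ", ( parses($input) ? 'parsed' : 'no parse' ), "\n";
}
```

With this shape, a single 'A' should not parse, because many_a needs at least one a and data needs one more after it, while two or more should.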
Replies are listed 'Best First'.

Re: Marpa -- Partial grammar match does not result in failure
by amon (Scribe) on May 08, 2014 at 01:45 UTC
by tj_thompson (Monk) on May 08, 2014 at 04:33 UTC
by tj_thompson (Monk) on May 08, 2014 at 18:16 UTC
by tj_thompson (Monk) on May 08, 2014 at 16:50 UTC

Re: Marpa -- Partial grammar match does not result in failure
by BrowserUk (Patriarch) on May 08, 2014 at 00:41 UTC
by amon (Scribe) on May 08, 2014 at 01:54 UTC

Re: Marpa -- Partial grammar match does not result in failure
by locked_user sundialsvc4 (Abbot) on May 08, 2014 at 12:38 UTC