Re: Cannot get Marpa::R2 to prioritise one rule over another

Hello again,

Maybe a second attempt is better than the first one. I had to specify what a hostname is in an ugly way, but it seems viable.

I'm going mad trying to understand why the dot . is passed in for IPs and not for hostnames! (Because the ipv4 rule ends with an action?)

#!/usr/bin/env perl
use warnings;
use strict;

use Data::Dump;
use Marpa::R2;

my $rules = <<'END_OF_GRAMMAR';
lexeme default = latm => 1
:default ::= action => ::first

entry           ::= op hostaddr4    action => dump_entry
op              ::= 'add'           action => add_op
                  | 'remove'        action => add_op
                  
hostaddr4       ::= hostname | ipv4


hostname        ::= DOMAIN  EXT               action => add_hostname
                  | DOMAIN DOMAIN EXT         action => add_hostname
                  | DOMAIN DOMAIN DOMAIN EXT  action => add_hostname

DOMAIN          ::= NAME '.'
NAME            ~ [\d\w]+
EXT             ~ 'org' | 'net'                 

ipv4            ::= NUMBER '.' NUMBER '.' NUMBER '.' NUMBER action => add_ip
NUMBER          ~ [\d]+
                    
:discard        ~ SP
SP              ~ [\s]+    


END_OF_GRAMMAR

my $input = <<'END_OF_INPUT';
add example.org
add www.perl.org
add 42.perl.net
add 192.0.2.1
remove 192.0.2.2

END_OF_INPUT

my $grammar = Marpa::R2::Scanless::G->new({source => \$rules});

for (split /^/m, $input) {
    chomp;
    if (length $_) {
        print "\nPARSING: $_\n";
        
        # parse() creates its own recognizer, so no separate
        # Marpa::R2::Scanless::R object is needed; 'main' is the
        # package in which the action names are resolved.
        my $value_ref = $grammar->parse( \$_, 'main');
    }
}

sub dump_entry {
    print "dump_entry received: "; dd shift @_;
}

sub add_op {
    my $self = shift @_;
    print "add_op received: "; dd @_;
    $$self{operator} = join '', @_;
    return $self;
}

sub add_ip {
    my $self = shift @_;
    print "add_ip received: "; dd @_;
    $$self{type}  = 'IP';
    $$self{value} = join '', @_;
    return $self;
}

sub add_hostname {
    my $self = shift @_;
    print "add_hostname received: "; dd @_;
    $$self{type}  = 'hostname';
    $$self{value} = join '.', @_;
    return $self;
}
__DATA__

PARSING: add example.org
add_op received: "add"
add_hostname received: ("example", "org")
dump_entry received: { operator => "add", type => "hostname", value => "example.org" }

PARSING: add www.perl.org
add_op received: "add"
add_hostname received: ("www", "perl", "org")
dump_entry received: { operator => "add", type => "hostname", value => "www.perl.org" }

PARSING: add 42.perl.net
add_op received: "add"
add_hostname received: (42, "perl", "net")
dump_entry received: { operator => "add", type => "hostname", value => "42.perl.net" }

PARSING: add 192.0.2.1
add_op received: "add"
add_ip received: (192, ".", 0, ".", 2, ".", 1)
dump_entry received: { operator => "add", type => "IP", value => "192.0.2.1" }

PARSING: remove 192.0.2.2
add_op received: "remove"
add_ip received: (192, ".", 0, ".", 2, ".", 2)
dump_entry received: { operator => "remove", type => "IP", value => "192.0.2.2" }

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.


Re^2: Cannot get Marpa::R2 to prioritise one rule over another by choroba (Cardinal) on Jan 21, 2021 at 17:16 UTC
The dot is ignored because the default action (::first) is used for the DOMAIN rule, so only the value of its first symbol, NAME, is passed up. Mixing the lexer and grammar rules is not a good idea, they're very different. Using consistent capitalization for the non-terminals also helps; I usually use a different convention for the grammar rules and the lexer ones. I usually build the grammar from the top to the bottom, i.e. from the starting symbol to the L0 rules. I start with the default action of `[name,values]` and replace it with individual actions from the bottom to the top. The result might be something like Read more... (3 kB)

`map{substr$_->[0],$_->[1]||0,1}[\||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^ARGV,3]`
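To make the point about `::first` concrete, here is a minimal sketch in plain Perl that imitates, without loading Marpa::R2, which values reach each action. The helper `first_child` and the two arrays are hypothetical stand-ins; the only behaviour borrowed from Marpa is that `::first` yields the value of the first right-hand-side symbol and drops the rest.

```perl
use strict;
use warnings;

# Stand-in for Marpa's ::first: keep the first child value, drop the others.
sub first_child { my @children = @_; return $children[0] }

# DOMAIN ::= NAME '.' has no explicit action, so ::first applies and
# the '.' child is dropped before the hostname action ever runs.
my @hostname_children = (
    first_child( 'www',  '.' ),
    first_child( 'perl', '.' ),
    'org',    # EXT is a lexeme and is passed through unchanged
);

# ipv4 ::= NUMBER '.' NUMBER '.' NUMBER '.' NUMBER  action => add_ip
# The literal dots sit directly on this rule's right-hand side,
# so they arrive as children of add_ip.
my @ipv4_children = ( 192, '.', 0, '.', 2, '.', 1 );

print join( '.', @hostname_children ), "\n";    # add_hostname joins with '.'
print join( '',  @ipv4_children ),     "\n";    # add_ip joins with ''
```

This matches the output in the original post: `add_hostname` sees three values without dots, while `add_ip` sees seven values including the dots.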
Re^3: Cannot get Marpa::R2 to prioritise one rule over another by Discipulus (Canon) on Jan 22, 2021 at 09:02 UTC
Hello choroba, could you be so kind as to explain this in more detail:
> Mixing the lexer and grammar rules is not a good idea, they're very different.
I ask because I'm reading the Marpa-R2 vocabulary and I am not able to define them strictly. Where does my code mix them?

L*
Re^4: Cannot get Marpa::R2 to prioritise one rule over another by choroba (Cardinal) on Jan 22, 2021 at 09:14 UTC
Lines 24 and 25 of the script contain lexer rules (easily recognisable by the tilde), but line 27 contains a grammar rule again, followed by another lexer rule. But maybe that's how you like it. The more important question is whether you know what the difference between them is; separating them visually helped me to keep them separated in my head, too.
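As a sketch of that visual separation, the grammar from the original post could be laid out with the structural (G1) rules, declared with `::=`, first and the lexical (L0) rules, declared with `~`, grouped at the end. The rules are the same ones as before; renaming `DOMAIN` to lower-case `domain` and the `#` comments are illustrative choices, not something the thread prescribes.

```perl
use strict;
use warnings;

# Same rules as in the original post, only reordered, with DOMAIN renamed.
my $rules = <<'END_OF_GRAMMAR';
lexeme default = latm => 1
:default ::= action => ::first

# G1 (structural) rules, declared with ::=
entry       ::= op hostaddr4                action => dump_entry
op          ::= 'add'                       action => add_op
              | 'remove'                    action => add_op
hostaddr4   ::= hostname | ipv4
hostname    ::= domain EXT                  action => add_hostname
              | domain domain EXT           action => add_hostname
              | domain domain domain EXT    action => add_hostname
domain      ::= NAME '.'
ipv4        ::= NUMBER '.' NUMBER '.' NUMBER '.' NUMBER action => add_ip

# L0 (lexical) rules, declared with ~
NAME        ~ [\d\w]+
EXT         ~ 'org' | 'net'
NUMBER      ~ [\d]+
:discard    ~ SP
SP          ~ [\s]+
END_OF_GRAMMAR

print $rules;
```

With this layout it is easier to see that `domain` is a structural rule with two children, which is where the default action comes into play.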
Re^3: Cannot get Marpa::R2 to prioritise one rule over another by Anonymous Monk on Jan 21, 2021 at 21:07 UTC
Thanks for demonstrating how to recompose the dotted components of hostnames and IPs, using a custom action. I had been wondering how best to go about that, and you have given me a starting point. One question, regarding your `concat` subroutine, if I may: Is it possible to generalise it to return the `[rulename,concatted-string]` pair, so it conforms to the tokens emitted by the default action `[name,values]`, or would I have to have a separate subroutine for each rule (and return the rulename literally)? I had originally thought there might be context in the first argument, which you `shift` over, but that appears to be an empty hashref in all cases I've seen.
Re^4: Cannot get Marpa::R2 to prioritise one rule over another by choroba (Cardinal) on Jan 21, 2021 at 21:14 UTC
The first argument is there for you, you can store whatever you want in it. But if you can build the result just by composition, I don't see a reason to use it. AFAIK, there aren't many predefined actions (::first, [name,values]). Concatenation is definitely not a universal thing, you typically propagate structures, not strings.
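One plain-Perl way to avoid writing a separate subroutine per rule, offered only as a sketch: generate the actions from a small factory that bakes the rule name in. The names `make_concat`, `concat_hostname` and `concat_ipv4` are hypothetical, and the subs are called by hand here rather than through Marpa; a grammar would have to refer to them with matching `action => ...` names.

```perl
use strict;
use warnings;

# Hypothetical factory: builds an action for one rule and closes over the
# rule name, so a single implementation can serve several rules.
sub make_concat {
    my ( $rule_name, $separator ) = @_;
    return sub {
        my ( $per_parse, @children ) = @_;    # first argument: per-parse object
        return [ $rule_name, join $separator, @children ];
    };
}

# Install the generated subs under the names the grammar would use
# in its "action => ..." adverbs (these names are assumptions).
{
    no strict 'refs';
    *{'main::concat_hostname'} = make_concat( 'hostname', '.' );
    *{'main::concat_ipv4'}     = make_concat( 'ipv4',     '' );
}

# Called by hand with the argument lists shown in the original output.
my $host = concat_hostname( {}, 'www', 'perl', 'org' );
my $ip   = concat_ipv4( {}, 192, '.', 0, '.', 2, '.', 1 );

print "$host->[0]: $host->[1]\n";
print "$ip->[0]: $ip->[1]\n";
```

The per-parse object is accepted but left unused, in line with the advice that it is only needed when the result cannot be built by composition.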
Re^5: Cannot get Marpa::R2 to prioritise one rule over another by Anonymous Monk on Jan 21, 2021 at 21:56 UTC
Re^6: Cannot get Marpa::R2 to prioritise one rule over another by choroba (Cardinal) on Jan 21, 2021 at 22:17 UTC
Re^2: Cannot get Marpa::R2 to prioritise one rule over another by Anonymous Monk on Jan 21, 2021 at 20:55 UTC
Thanks for this attempt, but I'm not sure that defining `hostname` as a fixed number of `DOMAIN` components, or defining a limited set of `EXT` suffixes, is the right way to go. Hostnames can be arbitrarily long, at least in terms of subdomains, and the list of top-level domains is growing by the day. I'm probably going to settle for just capturing `NAME` and handing off the distinction between IPv4, (later) IPv6, and neither of those to a custom action. Given the complexity of the problem (esp. IPv6), that is likely the best way forward.

J.
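A minimal sketch of what such a custom action could delegate to, in plain Perl: a helper that looks at the captured token and decides between IPv4, hostname, and neither. The name `classify_name` and the exact checks are assumptions for illustration; IPv6 is deliberately left out, and the hostname test is loose (it imposes no label count and no list of top-level domains).

```perl
use strict;
use warnings;

# Hypothetical helper: classify a captured dotted token.
sub classify_name {
    my ($token) = @_;
    my @parts = split /\./, $token, -1;    # -1 keeps trailing empty fields

    # IPv4: exactly four decimal fields, each in the range 0..255.
    if ( @parts == 4 && !grep { !/\A\d{1,3}\z/ || $_ > 255 } @parts ) {
        return 'IP';
    }

    # Hostname: one or more non-empty labels made of letters, digits and
    # hyphens, with no hyphen at either end of a label.
    my $label = qr/\A[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\z/;
    if ( @parts && !grep { $_ !~ $label } @parts ) {
        return 'hostname';
    }

    return 'unknown';
}

for my $token (qw( example.org www.perl.org 42.perl.net 192.0.2.1 a..b )) {
    printf "%-14s %s\n", $token, classify_name($token);
}
```

Because the decision is made in ordinary Perl, the grammar itself only needs one lexeme for the whole token, and stricter rules can be added later without touching it.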

