Re: Regex to match a Cisco ACL
by Corion (Patriarch) on May 22, 2011 at 09:55 UTC
|
A simple, easy-to-debug approach is to build your regular expression from smaller regular expressions that you have already tested. For example:
my $re_protocols = qr/ip|tcp|udp|object-group\s+(\w+)/;
my $re_port = qr/eq\s+(\d+)|range\s+(\d+)\s+(\d+)|/;
my $acl = qr/access-list
\s+
(\w+) # name
\s+
(\w+) # action
\s+
($re_protocols)
\s+
# ... and so on
/x;
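The "tested before" step can be sketched roughly as follows. This is only an illustration: `$re_protocol`, `$re_port`, and the helper `matches_fully` are hypothetical names modelled on the fragments above, and the sample strings are made up for the check.

```perl
use strict;
use warnings;

# Hypothetical building blocks, modelled on the fragments above.
# Each one is wrapped in (?:...) so it can be anchored and reused safely.
my $re_protocol = qr/(?:ip|tcp|udp|object-group\s+\w+)/;
my $re_port     = qr/(?:eq\s+\d+|range\s+\d+\s+\d+)/;

# Returns 1 if $text matches $re from start to end, 0 otherwise.
sub matches_fully {
    my ($re, $text) = @_;
    return $text =~ /\A$re\z/ ? 1 : 0;
}

# Each entry: pattern, sample text, expected result.
my @checks = (
    [ $re_protocol, 'tcp',                   1 ],
    [ $re_protocol, 'object-group My_Group', 1 ],
    [ $re_protocol, 'icmp',                  0 ],
    [ $re_port,     'eq 445',                1 ],
    [ $re_port,     'range 137 139',         1 ],
    [ $re_port,     'range 137',             0 ],
);

for my $check (@checks) {
    my ($re, $text, $expected) = @$check;
    my $got = matches_fully($re, $text);
    printf "%-24s %s\n", "'$text'", $got == $expected ? 'ok' : 'NOT ok';
}
```

Once every piece behaves on its own, a failure in the combined pattern can only come from how the pieces are glued together.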
|
|
Thanks! That is a pretty easy approach indeed. But the source and destination networks are still a problem.
If I find an "any" or "host X.X.X.X" entry in the line, how do I help the regex determine whether it is the source or the destination? The same limitation applies to $re_protocols as well: it can't tell the difference between an object-group entry for the protocol, the source network, or the destination network.
Re: Regex to match a Cisco ACL
by CountZero (Bishop) on May 22, 2011 at 09:54 UTC
|
It would be easier to tackle this if you could give us some actual lines to work with, rather than the description. If possible some lines that show the various possibilities.
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
|
|
I totally agree; the descriptions are not very helpful. Here are some sample lines:
access-list V420_IN extended permit object-group Symantec_Service_Group object-group Symantec_Clients Symantec_Servers (all 3 object groups)
access-list V420_IN extended permit object-group Symantec_Service_Group 10.148.0.0 255.254.0.0 host 10.149.16.40 (One service group and two network addresses)
access-list V420_IN extended permit object-group Symantec_Service_Group any any (Source & Destination any)
access-list V420_IN extended permit tcp any any range 137 139 (with a range of TCP ports)
access-list V420_IN extended permit tcp any any eq 445 (with a single service port)
Of course there are many more, but they are all permutations of the different possibilities for the three sections I mentioned above. I hope it makes more sense now.
Re: Regex to match a Cisco ACL
by GrandFather (Saint) on May 22, 2011 at 11:27 UTC
|
|
|
Thanks. I had skimmed the documentation of these modules earlier, but found them a bit overkill for the task. In any case, I am trying to learn to code it on my own. I already have working code that does what I want, but I am curious to explore better ways.
Re: Regex to match a Cisco ACL
by CountZero (Bishop) on May 22, 2011 at 15:11 UTC
|
What do you think of this?
use Modern::Perl;
use Regexp::Common;
my $cisco_protocol = qr {
(?:ip|tcp|udp)
|
(?:object-group\s+[\S]+)
}x;
my $cisco_network = qr{
(?:host\s+[\S]+)
|
(?:$RE{net}{IPv4}\s+$RE{net}{IPv4})
|
(?:object-group\s+[\S]+)
|
any
}x;
my $cisco_ports = qr{
(?:eq\s+\d+)
|
(?:range\s+\d+\s+\d+)
}x;
my $cisco_regex = qr{^
access-list
\s+
(?<name>[\S]+) # name
\s+
extended
\s+
(?<action>(?:permit|deny)) # action
\s+
(?<proto>$cisco_protocol) # protocol
\s+
(?<source>$cisco_network) # source_network
\s+
(?<destination>$cisco_network) # destination_network
(?:\s+(?<ports>$cisco_ports))? # ports
}x;
while ( my $rule = <DATA> ) {
chomp $rule;
say "Parsing >$rule<";
if ( $rule =~ m/$cisco_regex/ ) {
say "Name: $+{name}";
say "Action: $+{action}";
say "Protocol: $+{proto}";
say "Source: $+{source}";
say "Destination: $+{destination}";
say "Ports: $+{ports}" if defined $+{ports};
print "\n";
}
else { say "No match\n"; }
}
__DATA__
access-list V420_IN extended permit object-group Symantec_Service_Group object-group Symantec_Clients Symantec_Servers
access-list V420_IN extended permit object-group Symantec_Service_Group 10.148.0.0 255.254.0.0 host 10.149.16.40
access-list V420_IN extended permit object-group Symantec_Service_Group any any
access-list V420_IN extended permit tcp any any range 137 139
access-list V420_IN extended permit tcp any any eq 445
Output:
Parsing >access-list V420_IN extended permit object-group Symantec_Service_Group object-group Symantec_Clients Symantec_Servers<
No match
Parsing >access-list V420_IN extended permit object-group Symantec_Service_Group 10.148.0.0 255.254.0.0 host 10.149.16.40<
Name: V420_IN
Action: permit
Protocol: object-group Symantec_Service_Group
Source: 10.148.0.0 255.254.0.0
Destination: host 10.149.16.40
Parsing >access-list V420_IN extended permit object-group Symantec_Service_Group any any<
Name: V420_IN
Action: permit
Protocol: object-group Symantec_Service_Group
Source: any
Destination: any
Parsing >access-list V420_IN extended permit tcp any any range 137 139<
Name: V420_IN
Action: permit
Protocol: tcp
Source: any
Destination: any
Ports: range 137 139
Parsing >access-list V420_IN extended permit tcp any any eq 445<
Name: V420_IN
Action: permit
Protocol: tcp
Source: any
Destination: any
Ports: eq 445
As you can see, the first one does not match, because I think the rule is not well formed: it seems to be missing an object-group token.
|
|
CountZero, it is spot on! I ran it on a config file containing around 900 rules. It matched most of them, except for a few cases that I didn't outline above. I still have to modify it to parse remarks and match them with the appropriate rules, but that is my homework.
My previous code took almost a minute to parse the file. This one produced the result instantaneously, which proves to me that I wasn't wrong in approaching the monks :)
Re: Regex to match a Cisco ACL
by JavaFan (Canon) on May 22, 2011 at 13:29 UTC
|
You're just giving fragments of a grammar. For instance, what's
object-group <protocol object-group name> (2 word)
mean? What are the rules here, and what are the tokens? And what does the "2 word" mean?
With the stuff Yves did in 2006/2007 for the regex engine, you can write patterns that match anything that can be given in a BNF, but I cannot make out from your post what it is you want to parse.
Note, however, that if you really want to parse, single regexes aren't so suitable. Patterns are great for extraction and validation, but are rather kludgy for parsing, because typically when you parse, you want to know the structure of the thing you parsed, and act accordingly. Regexes typically tell you they matched "abc", but won't tell you how that "abc" is composed (that is, if you had a regexp to parse English sentences, it would tell you that "John eats an apple" is a proper sentence, but it wouldn't tell you it's composed as subject-verb-object, with the subject a proper name, the verb third person singular (present tense), and the object being "indefinite article-noun").
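To make the point concrete: a bare match only answers yes or no, and any structure has to be added by hand, for example with one named capture per alternative. The sketch below is an assumption-laden illustration, not part of the original question; `$network` and `describe_network` are hypothetical names, and the pattern covers only three of the network forms.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical pattern: each alternative carries its own named capture,
# so the names present in %+ reveal which branch matched.
my $network = qr{
      (?<any> any )
    | host \s+ (?<host> \S+ )
    | (?<addr> \d+(?:\.\d+){3} ) \s+ (?<mask> \d+(?:\.\d+){3} )
}x;

# Returns a hash ref holding only the captures that took part in the
# match, or nothing if the text is not a network spec at all.
sub describe_network {
    my ($text) = @_;
    return unless $text =~ /\A(?:$network)\z/;
    my %parts = %+;    # %+ lists only the named groups that captured
    return \%parts;
}

for my $text ('any', 'host 10.149.16.40', '10.148.0.0 255.254.0.0') {
    my $parts = describe_network($text);
    say "$text => ", join ', ', map { "$_=$parts->{$_}" } sort keys %$parts;
}
```

This recovers one level of structure; anything nested deeper quickly becomes awkward, which is where a real parser earns its keep.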
|
|
Yes, I didn't try to post a rigid grammar since I assumed the expression was simple and self-explanatory. But there is no substitute for unambiguous representation. I came to know about BNF for the first time through your post, thanks a lot! Here is my attempt to describe Cisco's ACL grammar using BNF. Hope my grammar is right!
<acl> ::= "access-list" <interface-name> <action> <protocol> <source> <destination> <port>
<action> ::= "permit" | "deny"
<protocol> ::= "tcp" | "udp" | "ip" | "object-group" <object-group-name>
<source> ::= "object-group" <object-group-name> | "host" <host-address> | <network-address> <net-mask>
<destination> ::= "object-group" <object-group-name> | "host" <host-address> | <network-address> <net-mask>
<port> ::= "eq" <port-number> | "range" <low-port> <high-port> | ""
The strings that I want to capture in my code are <action>, <protocol>, <source>, <destination>, and <port>. Sorry for my previous ambiguous post.
If regexes are not the best way to decode this, how else can I do it? I basically want to learn a solid way to deal with config files, instead of using nested ifs or tracking position with a dodgy index variable.
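One alternative to an index variable, sketched here under assumptions, is to treat the line as a queue of tokens and let one small subroutine per grammar rule consume what it needs. All subroutine names below are hypothetical, the keyword "extended" is assumed from the sample lines, and trailing remarks are not handled.

```perl
use 5.010;
use strict;
use warnings;

# Take one token off the front of the list, or die if none are left.
sub next_token {
    my ($tokens) = @_;
    die "unexpected end of rule\n" unless @$tokens;
    return shift @$tokens;
}

# Protocol: a single keyword, or "object-group NAME".
sub take_protocol {
    my ($tokens) = @_;
    my $word = next_token($tokens);
    return $word eq 'object-group' ? "$word " . next_token($tokens) : $word;
}

# Network: "any", "host ADDR", "object-group NAME", or "ADDR MASK".
sub take_network {
    my ($tokens) = @_;
    my $word = next_token($tokens);
    return $word if $word eq 'any';
    return "$word " . next_token($tokens);    # every other form has two words
}

# Ports are optional: "eq N" or "range LOW HIGH".
sub take_ports {
    my ($tokens) = @_;
    return unless @$tokens;
    return join ' ', splice(@$tokens, 0, 2) if $tokens->[0] eq 'eq';
    return join ' ', splice(@$tokens, 0, 3) if $tokens->[0] eq 'range';
    return;
}

sub parse_acl {
    my ($line) = @_;
    my @tokens = split ' ', $line;
    my %rule;
    next_token(\@tokens) eq 'access-list' or die "not an access-list line\n";
    $rule{name} = next_token(\@tokens);
    next_token(\@tokens) eq 'extended' or die "expected 'extended'\n";
    $rule{action}      = next_token(\@tokens);
    $rule{protocol}    = take_protocol(\@tokens);
    $rule{source}      = take_network(\@tokens);
    $rule{destination} = take_network(\@tokens);
    $rule{ports}       = take_ports(\@tokens);
    return \%rule;
}

my $rule = parse_acl('access-list V420_IN extended permit tcp any any range 137 139');
say "$_: ", $rule->{$_} // '(none)' for sort keys %$rule;
```

Because source and destination share `take_network`, the position in `parse_acl` alone decides which is which, and no index arithmetic is needed.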
|
|
"Yes, I didn't try to post a rigid grammar since I assumed the expression was simple and self-explanatory. But there is no substitute for unambiguous representation. I came to know about BNF for the first time through your post, thanks a lot!
Here is my attempt to describe Cisco's ACL grammar using BNF. Hope my grammar is right!"
It isn’t, because you’re missing a bunch of productions.
"If regexes are not the best way to decode this, how else can I do it? I basically want to learn a solid way to deal with config files, instead of using nested ifs or tracking position with a dodgy index variable."
Oh, regexes are a good approach. You just have to make them grammatical is all. You’ll need to be running v5.10 for that.
Here’s an example of converting your corrected BNF into a grammatical regex that at least compiles. I don’t know whether it’s right because I have no set of sample inputs from which to construct a test suite. This first version doesn’t do any capturing, but the second one, which I give further down, does.
use v5.10;
my $acl_rx = qr{
(?&acl) # match one of these
# according to these "regex sub" definitions:
(?(DEFINE)
(?<acl> access-list (?&interface_name) (?&action) (?&protocol) (?&source) (?&destination) (?&port) )
(?<action> permit | deny )
(?<protocol> tcp | udp | ip | object-group (?&object_group_name) )
(?<source> object-group (?&object_group_name) | host (?&host_address) | (?&network_address) (?&net_mask) )
(?<destination> object-group (?&object_group_name) | host (?&host_address) | (?&network_address) (?&net_mask) )
(?<port> eq (?&port_number) | range (?&low_port) (?&high_port) | )
(?<interface_name> (?&chunk) )
(?<object_group_name> (?&chunk) )
(?<host_address> (?&chunk) )
(?<network_address> (?&chunk) )
(?<net_mask> (?&chunk) )
(?<port_number> (?&chunk) )
(?<low_port> (?&chunk) )
(?<high_port> (?&chunk) )
(?<chunk> (?&ws) \S+ (?&ws) )
(?<ws> \s* )
)
}x;
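As a hedged aside on capturing with this style: a named group placed outside the (?(DEFINE)...) block, wrapped around a call such as (?&port_spec), keeps its value in %+ after the match. The cut-down pattern below is only a sketch; `$port_rx`, `port_spec`, and `number` are made-up names and it covers nothing but the port clause.

```perl
use v5.10;
use strict;
use warnings;

# Hypothetical, cut-down grammar for the port clause only.
my $port_rx = qr{
    \A \s* (?<port> (?&port_spec) ) \s* \z    # capture happens out here

    (?(DEFINE)
        (?<port_spec> eq \s+ (?&number) | range \s+ (?&number) \s+ (?&number) )
        (?<number>    \d+ )
    )
}x;

for my $text ('eq 445', 'range 137 139', 'range 137') {
    if ($text =~ $port_rx) {
        say "'$text' => $+{port}";
    }
    else {
        say "'$text' => no match";
    }
}
```

Groups that are only reached through a (?&name) call do not keep their captures once the call returns, so the capturing group has to sit at the top level of the pattern.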
However, I much prefer this version, which uses Damian’s Regexp::Grammars module:
use v5.10; # enables say
use Data::Dump;
my $acl_grammar = do {
use Regexp::Grammars;
qr{
# In case you need it, uncomment this line:
# <debug:on>
# Match this...
<acl>
# According to these definitions:
<rule: acl> access-list <interface_name> extended <action> <protocol> <source> <destination> <port> <comment>
<rule: action> permit | deny
<rule: protocol> tcp | udp | ip | object-group <object_group_name>
<rule: object_group> object-group <object_group_name>
<rule: source> object-group <object_group_name> | host <host_address> | <network_address> <net_mask>
<rule: destination> object-group <object_group_name> | host <host_address> | <network_address> <net_mask>
<rule: port> eq <port_number> | range <low_port> <high_port> |
<rule: interface_name> <name>
<rule: object_group_name> <chunk>
<rule: host_address> <address>
<rule: network_address> <address>
<rule: net_mask> <address>
<rule: port_number> <portno>
<rule: low_port> <portno>
<rule: high_port> <portno>
<rule: address> any | <dotted_quad> | <name>
<rule: portno> \d+
<token: dotted_quad> <.octet> ( <.dot> <.octet> ){3}
<rule: comment> \( [^()]* \)
<token: chunk> \w+
<token: octet> \d{0,3}
<token: dot> \.
<token: name> <.capword> ** _
<token: capword> \p{Lu} \p{Alnum}+
}x;
};
while (my $input = <DATA>) {
if ($input =~ $acl_grammar) {
say "MATCHED";
dd \%/; # parse tree of a successful match
# appears in the %/ variable
} else {
warn "CAN'T MATCH: $input";
}
}
__END__
access-list V420_IN extended permit object-group Symantec_Service_Group object-group Symantec_Clients Symantec_Servers (all 3 object groups)
access-list V421_IN extended permit object-group Symantec_Service_Group 10.148.0.0 255.254.0.0 host 10.149.16.40 (One service group and two network addresses)
access-list V422_IN extended permit object-group Symantec_Service_Group any any (Source & Destination any)
access-list V423_IN extended permit tcp any any range 137 139 (with a range of TCP ports)
access-list V424_IN extended permit tcp any any eq 445 (with a single service port)
Isn’t that splendid?
I’ve had to correct and update your grammar, but it still matches only the first two records. I’ll leave the rest as an exercise for the reader. :) Here is the output it produces:
MATCHED
{
"" => "access-list V420_IN extended permit object-group Symantec_Service_Group object-group Symantec_Clients Symantec_Servers (all 3 object groups)",
"acl" => {
"" => "access-list V420_IN extended permit object-group Symantec_Service_Group object-group Symantec_Clients Symantec_Servers (all 3 object groups)",
"action" => "permit",
"comment" => "(all 3 object groups)",
"destination" => {
"" => "Clients Symantec_Servers",
"net_mask" => {
"" => "Symantec_Servers",
"address" => { "" => "Symantec_Servers", "name" => "Symantec_Servers" },
},
"network_address" => {
"" => "Clients",
"address" => { "" => "Clients", "name" => "Clients" },
},
},
"interface_name" => { "" => "V420_IN", "name" => "V420_IN" },
"port" => "",
"protocol" => {
"" => "object-group Symantec_Service_Group",
"object_group_name" => { "" => "Symantec_Service_Group", "chunk" => "Symantec_Service_Group" },
},
"source" => {
"" => "object-group Symantec_",
"object_group_name" => { "" => "Symantec_", "chunk" => "Symantec_" },
},
},
}
MATCHED
{
"" => "access-list V421_IN extended permit object-group Symantec_Service_Group 10.148.0.0 255.254.0.0 host 10.149.16.40 (One service group and two network addresses)",
"acl" => {
"" => "access-list V421_IN extended permit object-group Symantec_Service_Group 10.148.0.0 255.254.0.0 host 10.149.16.40 (One service group and two network addresses)",
"action" => "permit",
"comment" => "(One service group and two network addresses)",
"destination" => {
"" => "host 10.149.16.40",
"host_address" => {
"" => "10.149.16.40",
"address" => { "" => "10.149.16.40", "dotted_quad" => "10.149.16.40" },
},
},
"interface_name" => { "" => "V421_IN", "name" => "V421_IN" },
"port" => "",
"protocol" => {
"" => "object-group Symantec_Service_Group",
"object_group_name" => { "" => "Symantec_Service_Group", "chunk" => "Symantec_Service_Group" },
},
"source" => {
"" => "10.148.0.0 255.254.0.0",
"net_mask" => {
"" => "255.254.0.0",
"address" => { "" => "255.254.0.0", "dotted_quad" => "255.254.0.0" },
},
"network_address" => {
"" => "10.148.0.0",
"address" => { "" => "10.148.0.0", "dotted_quad" => "10.148.0.0" },
},
},
},
}
CAN'T MATCH: access-list V422_IN extended permit object-group Symantec_Service_Group any any (Source & Destination any)
CAN'T MATCH: access-list V423_IN extended permit tcp any any range 137 139 (with a range of TCP ports)
CAN'T MATCH: access-list V424_IN extended permit tcp any any eq 445 (with a single service port)
Good luck!
|
|
Something like (untested):
my $pat = qr {
(?(DEFINE)
(?<action> (?:\s*\b(?:permit|deny)\b))
(?<protocol> (?:\s*\b(?:tcp|udp|ip|object-group(?&object_group_name))\b))
(?<object_group_name> (?:please define))
(?<source> (?:\s*\b(?:object-group (?&object_group_name)|host (?&host_address)|(?&network_address) (?&net_mask))\b))
(?<host_address> (?:please define))
(?<network_address> (?:please define))
(?<net_mask> (?:please define))
(?<destination> (?&source))
(?<port> (?:\s*\b(?:eq (?&port_number)|range (?&low_port) (?&high_port)|)\b))
(?<port_number> (?:please define))
(?<low_port> (?:please define))
(?<high_port> (?:please define))
)
acl ((?&action)) ((?&protocol)) ((?&source)) ((?&destination)) ((?&port))
}x;
Re: Regex to match a Cisco ACL
by Anonymous Monk on May 22, 2011 at 10:34 UTC
|
OK, here is part of my ugly code (the better-looking parts are borrowed from other, proper code). I use a pointer that travels along each line one word at a time and populates the hash. I hope someone can show me the right way to do it.
my @matches = split / /,$entry;
my $position = 3;
if ($matches[$position] eq "permit")
{
$hash{func} = $matches[$position];
$position++;
}
elsif ($matches[$position] eq "deny")
{
$hash{func} = $matches[$position];
$position++;
}
#completed action
if ($matches[$position] eq "object-group")
{
$position++;
my $protocol = get_obj_grp ($matches[$position]); # get_obj_grp is a separate function that reads from a hash table
$hash{protocol} = join('<br>',@{$protocol->{entries}});
$position++;
}
elsif($matches[$position] =~ /^(?:ip|tcp|udp|icmp)$/) # anchored so that only whole keywords match
{
$hash{protocol} = $matches[$position];
$position++;
}
else
{
$position++; #shouldn't reach here
}# completed protocol
if ($matches[$position] eq "object-group")
{
$position++;
my $source = get_obj_grp ($matches[$position]);
$hash{source_net} = join('<br>',@{$source->{entries}});
$position++;
}
elsif ($matches[$position] eq "any")
{
$hash{source_net} = "any";
$position++;
}
elsif ($matches[$position] eq "host")
{
$position++;
if ($matches[$position] =~ m/^\d+\.\d+\.\d+\.\d+$/) {
$hash{source_net} = "host $matches[$position]";
}
else{
$hash{source_net} = "host " . find_name($matches[$position]);
}
$position++;
}
elsif($matches[$position] =~ m/^\d+\.\d+\.\d+\.\d+$/)
{
$hash{source_net} = $matches[$position]." ".$matches[$position+1];
$position++;
$position++;
}
elsif($matches[$position+1] =~ m/^\d+\.\d+\.\d+\.\d+$/)
{
$hash{source_net} = find_name($matches[$position])." ".$matches[$position+1];
$position++;
$position++;
}
#completed source
if ($matches[$position] eq "object-group")
{
$position++;
my $source = get_obj_grp ($matches[$position]);
$hash{dest_net} = join('<br>',@{$source->{entries}});
$position++;
}
elsif ($matches[$position] eq "any")
{
$hash{dest_net} = "any";
$position++;
}
elsif ($matches[$position] eq "host")
{
$position++;
if ($matches[$position] =~ m/^\d+\.\d+\.\d+\.\d+$/) {
$hash{dest_net} = "host $matches[$position]";
}
else{
$hash{dest_net} = "host " . find_name($matches[$position]);
}
$position++;
}
elsif($matches[$position] =~ m/^\d+\.\d+\.\d+\.\d+$/)
{
$hash{dest_net} = $matches[$position]." ".$matches[$position+1];
$position++;
$position++;
}
elsif($matches[$position+1] =~ m/^\d+\.\d+\.\d+\.\d+$/)
{
$hash{dest_net} = find_name($matches[$position])." ".$matches[$position+1];
$position++;
$position++;
}
#completed destination
if ($matches[$position] eq "object-group")
{
$position++;
my $protocol = get_obj_grp ($matches[$position]);
$hash{dest_port} = join('<br>',@{$protocol->{entries}});
$position++;
}
elsif($matches[$position] eq "eq")
{
$position++;
$hash{dest_port} = $matches[$position];
}
elsif($matches[$position] eq "range")
{
$position++;
$hash{dest_port} = $matches[$position]." to ".$matches[$position+1];
$position++;
$position++;
}
else
{
$position++
}# completed port
$entry = \%hash;
}