symgryph has asked for the wisdom of the Perl Monks concerning the following question:

I have the following line to parse:
May 20 18:57:27 1.23.25.5 %ASA-6-106100 a6 [local4.info] access-list Myaccess-Block permitted tcp outside/10.31.0.9(3803) -> inside/10.29.10.91(4127) hit-cnt 1 300-second interval [0xa178b29d, 0x0]
I have written the following regexes to parse using memory 1-xxx for each given section:
Time: (^\S+\s\d+\s+\d+:\d+:\d+) using memory 1
Source Firewall: (\d+\.\d+\.\d+\.\d+) using memory 1
Source part of firewall connection: access-list\s+(\S+)\s+(\S+)\s+(\S+)\s(\S+)\/(\d+\.\d+\.\d+\.\d+)\((\d+)
1=source acl 2=action 3=protocol 4=source interface 5=source ip 6=source port
Destination part: ->\s+(\S+)/(\d+\.\d+\.\d+\.\d+)\((\d+)
1=Dest Interface 2=Destination IP 3=Destination Port
Matches Rule #: \[(0x[0-9a-f]+)
1=Rule#
I was wondering whether my regexes are naive and could be improved?
"Two Wheels good, Four wheels bad."

Replies are listed 'Best First'.
Re: A better way to parse this with regexes? Pix ASA logs
by rjt (Curate) on May 21, 2013 at 02:37 UTC

    Your regexen look reasonable enough to me. One way to keep your code a little cleaner when you have more than a couple of captures is with the use of the /x modifier and the new-ish named captures:

    #!/usr/bin/env perl
    use 5.014;
    use warnings;

    $_ = <DATA>;
    chomp;
    /^ (?<mon>\w+)\s(?<dd>\d\d) \s
       (?<time>..:..:..) # could capture hh:mm:ss separately if need be
       \s
       (?<src>\d+\.\d+\.\d+\.\d+)
       # keep going with your sub-expressions
    /x;

    say "time: $+{time} src: $+{src}";

    __DATA__
    May 20 18:57:27 1.23.25.5 %ASA-6-106100 a6 [local4.info] access-list Myaccess-Block permitted tcp outside/10.31.0.9(3803) -> inside/10.29.10.91(4127) hit-cnt 1 300-second interval [0xa178b29d, 0x0]

    Edit: I note you match dotted-quad IP addresses frequently. If it helps, you can throw that pattern in a variable for re-use:

    my $IP_ADDR = qr/\d+\.\d+\.\d+\.\d+/;
    /...(?<src>$IP_ADDR).../
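    Putting the two suggestions together, a whole-line pattern might look like the sketch below. It is only an illustration built from the one sample line in the question: the capture names, the `$ASA_LINE` variable and the `parse_asa_line` helper are all made up here, and other ASA message types would very likely need their own patterns.

```perl
#!/usr/bin/env perl
use 5.014;
use warnings;

# Reusable dotted-quad pattern (does not validate octet ranges).
my $IP_ADDR = qr/\d+\.\d+\.\d+\.\d+/;

# One /x pattern for the whole line. Capture names are hypothetical.
my $ASA_LINE = qr{
    ^ (?<time> \w+ \s+ \d+ \s+ \d+:\d+:\d+ ) \s+
      (?<firewall> $IP_ADDR ) \s+
      .*? \b access-list \s+
      (?<acl>    \S+ ) \s+
      (?<action> \S+ ) \s+
      (?<proto>  \S+ ) \s+
      (?<src_if> [^/\s]+ ) / (?<src_ip> $IP_ADDR ) \( (?<src_port> \d+ ) \) \s+
      -> \s+
      (?<dst_if> [^/\s]+ ) / (?<dst_ip> $IP_ADDR ) \( (?<dst_port> \d+ ) \)
      .*? \[ (?<rule> 0x[0-9a-f]+ ) ,
}x;

# Returns a hash reference of the named fields, or nothing on a non-match.
sub parse_asa_line {
    my ($line) = @_;
    return unless $line =~ $ASA_LINE;
    my %rec = %+;    # copy the named captures before a later match replaces them
    return \%rec;
}

my $sample = 'May 20 18:57:27 1.23.25.5 %ASA-6-106100 a6 [local4.info] '
           . 'access-list Myaccess-Block permitted tcp outside/10.31.0.9(3803) '
           . '-> inside/10.29.10.91(4127) hit-cnt 1 300-second interval '
           . '[0xa178b29d, 0x0]';

if ( my $rec = parse_asa_line($sample) ) {
    say "$_: $rec->{$_}" for sort keys %$rec;
}
```

    Copying `%+` into an ordinary hash straight after the match keeps the fields available even if another regex runs later in the same scope.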
Re: A better way to parse this with regexes? Pix ASA logs
by Athanasius (Archbishop) on May 21, 2013 at 02:46 UTC

    As rjt says, there’s nothing wrong with using a straight regex approach. However, in the spirit of TMTOWTDI, here’s a different approach which begins by dividing the line into word-like chunks:

    #! perl
    use strict;
    use warnings;

    my $string = 'May 20 18:57:27 1.23.25.5 %ASA-6-106100 a6 [local4.info] access-list Myaccess-Block permitted tcp outside/10.31.0.9(3803) -> inside/10.29.10.91(4127) hit-cnt 1 300-second interval [0xa178b29d, 0x0]';

    @ARGV = split /\s+/, $string;

    printf "Time: %s\n", join ' ', (shift, shift, shift);
    printf "Source Firewall: %s\n", shift;
    shift for 1 .. 4;
    printf "Source ACL: %s\n", shift;
    printf "Action: %s\n", shift;
    printf "Protocol: %s\n", shift;
    shift =~ m! (.+) / (.+) \( (\d+) \) !x;
    printf "Source Interface: %s\n", $1;
    printf "Source IP: %s\n", $2;
    printf "Source Port: %s\n", $3;
    shift;
    shift =~ m! (.+) / (.+) \( (\d+) \) !x;
    printf "Destination Interface: %s\n", $1;
    printf "Destination IP: %s\n", $2;
    printf "Destination Port: %s\n", $3;
    shift for 1 .. 4;
    shift =~ m! \[ (.+) , !x;
    printf "Rule: %s\n", $1;

    Output:

    12:39 >perl 626_SoPW.pl
    Time: May 20 18:57:27
    Source Firewall: 1.23.25.5
    Source ACL: Myaccess-Block
    Action: permitted
    Protocol: tcp
    Source Interface: outside
    Source IP: 10.31.0.9
    Source Port: 3803
    Destination Interface: inside
    Destination IP: 10.29.10.91
    Destination Port: 4127
    Rule: 0xa178b29d

    12:44 >
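    For use inside a larger program, the same split-first idea could be wrapped in a function that returns a hash and leaves @ARGV alone. This is a rough sketch, not part of the script above: the `parse_by_position` name and the hash keys are invented for illustration, and the field positions are an assumption taken from the single sample line, so a line with extra or missing words would need different indexes.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: split the line into words and pick fields by position.
# Returns a hash reference, or nothing if the line does not have the expected shape.
sub parse_by_position {
    my ($line) = @_;
    my @f = split ' ', $line;    # awk-style split: also skips leading whitespace
    return unless @f >= 19;

    # interface/ip(port) chunks for source and destination
    my @src = $f[11] =~ m{^([^/]+)/([^(]+)\((\d+)\)$} or return;
    my @dst = $f[13] =~ m{^([^/]+)/([^(]+)\((\d+)\)$} or return;
    my ($rule) = $f[18] =~ m{^\[(0x[0-9a-fA-F]+)} or return;

    my %rec = (
        time     => join( ' ', @f[ 0 .. 2 ] ),
        firewall => $f[3],
        acl      => $f[8],
        action   => $f[9],
        protocol => $f[10],
        rule     => $rule,
    );
    @rec{qw(src_if src_ip src_port)} = @src;
    @rec{qw(dst_if dst_ip dst_port)} = @dst;
    return \%rec;
}

my $string = 'May 20 18:57:27 1.23.25.5 %ASA-6-106100 a6 [local4.info] '
           . 'access-list Myaccess-Block permitted tcp outside/10.31.0.9(3803) '
           . '-> inside/10.29.10.91(4127) hit-cnt 1 300-second interval '
           . '[0xa178b29d, 0x0]';

if ( my $rec = parse_by_position($string) ) {
    printf "%-9s %s\n", $_, $rec->{$_} for sort keys %$rec;
}
```

    Checking each match with `or return` means a malformed line yields nothing rather than stale values of $1, $2 and $3 from an earlier match.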

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: A better way to parse this with regexes? Pix ASA logs
by Anonymous Monk on May 21, 2013 at 03:13 UTC