Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone. I am new to Perl, but I have been working with it to understand it and eventually have a very strong grasp of it. I know the basics on how to write a perl script, but I have been given a task that will require me to write a pretty difficult perl script. Being that I am new and learning Perl, I have decided to ask for some guidance.

Before I go on, I want to say a few things: First, "please do not write the Perl script for me". This is a great opportunity for me to learn and I would like to do it on my own. I am asking for some guidance and tips to help me, not only get started, but write this script.

Alright, here is what I am attempting to do. I want to set up a Perl script to extract certain pieces of information from our firewall log. Specifically, my manager wants me to set it up so that it reports any attempts to get in through ftp, ping or our proxy.

Right now, the firewall logs are stored on a Windows 2000 machine. The log file names end in .wgl, standing for WatchGuard log. (WatchGuard is the company...)

I would like to transfer the files over to a Linux box, run a Perl script on the log at night (Through cron) to extract the information that I want. Here is a snip from the firewall log, of what I am attempting to extract from the log:

03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21
03/13/03 16:44:57 firewalld[103] deny in eth0 48 tcp 20 117 212.241.116.21 209.126.xxx.xxx 4449 80 syn (LO-Proxied-HTTP)
Before I go on, I put the firewall log onto a Linux server so I could see what type of file it is. Here is what it is:

[tuxexdo@backupstorage]$ file 192.168.1.1-2003-03-17-12-28-30.wgl
192.168.1.1-2003-03-17-12-28-30.wgl: data
So it's a data file.

First question: Is it possible, to extract the information that I want from this file through Perl?

Thanks everyone.

Tarballed

Re: Extracting data from a firewall log
by l2kashe (Deacon) on Apr 09, 2003 at 18:39 UTC
    This is where perl really shines :)

    Note: I'm gonna nest in code brackets, 'cause I'm lazy

    Suggestions:

    You are able to view the data in some way, shape or form, so I would either
      A) figure out how to get that data out in a stream (i.e. calling an executable to provide that data on the fly for your parser), or
      B) instead of pulling the binary data, convert it to text, compress that and munge the compressed text.

    Determine what types of entries are in the file. Just looking at the data at hand I see at least 2 unique types of entries and a whole slew of other things to help with a parser: one msg from the kernel, and one msg from firewalld.

    Next we notice that the kernel is "Temporarily" blocking. Does it also log permanent blocks? Does this line correlate to an earlier firewalld line?

    The firewalld process is stating it denied a packet. There are all sorts of juicy bits in there. First off, the deny: what other actions can it take? Then the interface: what other interfaces are there? The next number is interesting, as I have no idea what it is correlated to. Do all denies get stamped with 48? Or packets on eth0, or tcp packets, or tcp packets destined for X port? Next the type: what other types are coming through?

    Already a parser is starting to become fleshed out. With some simple tweaking it should be relatively simple to do, especially if all entries are one-liners, which greatly reduces logic and the need for something along the lines of a quasi state machine.
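To make the "what types of entries are there?" step concrete, here is a rough sketch of a classifier that sorts each line by the process that wrote it and tallies the actions, interfaces and protocols it sees. The sub name classify_line, the field names, and the guess that the word after the action is a direction are all my own assumptions for illustration; the only log format it knows about is the two sample lines from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify one log line by the process that wrote it.
# Returns a hash ref describing the entry, or undef for unrecognised lines.
# (classify_line and its field names are hypothetical, not from any module.)
sub classify_line {
    my ($line) = @_;

    # e.g. "03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21"
    if ($line =~ m{^(\S+ \S+) kernel (.*)$}) {
        return { type => 'kernel', stamp => $1, message => $2 };
    }

    # e.g. "03/13/03 16:44:57 firewalld[103] deny in eth0 48 tcp ..."
    if ($line =~ m{^(\S+ \S+) firewalld\S* (\w+) (\w+) (\w+) (\d+) (\w+)}) {
        return {
            type      => 'firewalld',
            stamp     => $1,
            action    => $2,    # deny, allow, ...
            direction => $3,    # "in" in the sample; assumed to be a direction
            interface => $4,    # eth0, eth1, ...
            number    => $5,    # the "48"; its meaning is not known here
            protocol  => $6,    # tcp in the sample
        };
    }
    return;
}

# The two sample lines from the question.
my @sample = (
    '03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21',
    '03/13/03 16:44:57 firewalld[103] deny in eth0 48 tcp 20 117 '
        . '212.241.116.21 209.126.xxx.xxx 4449 80 syn (LO-Proxied-HTTP)',
);

# Tally what we see, so the unknowns (other actions, interfaces, types) show up.
my %seen;
for my $line (@sample) {
    my $entry = classify_line($line) or next;
    $seen{type}{ $entry->{type} }++;
    next unless $entry->{type} eq 'firewalld';
    $seen{$_}{ $entry->{$_} }++ for qw(action interface protocol);
}

for my $field (sort keys %seen) {
    my @counts = map { "$_ ($seen{$field}{$_})" } sort keys %{ $seen{$field} };
    print "$field: ", join(', ', @counts), "\n";
}
```

Running something like this over a full day's log should answer most of the questions above (which actions, which interfaces, which protocols) before any real parser gets written.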

    Happy hacking.. :)

    /* And the Creator, against his better judgement, wrote man.c */
Re: Extracting data from a firewall log
by tarballed (Initiate) on Apr 09, 2003 at 20:53 UTC
    Ok, I seem to be struggling here. I was wondering if someone can help me out with a few lines of my file, as far as regular expressions. I seem to be really stuck at this point.

    Let me show what I am trying to pull out of the file:

    03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21

    (This is where the initial block occurs. The firewall will then continue to block all attempts from this IP address. I would like to extract all entries in the firewall log like this one.)

    03/13/03 16:44:57 firewalld[103] deny in eth0 48 tcp 20 117 212.241.11.21 209.126.xxx.xxx 4449 80 syn (LO-Proxied-HTTP)

    (At this point, the firewall continues to block the attempt. I would like to extract all lines in the firewall that contain this as well...contains very useful information such as ports and packet sizes.)

    With that in mind, can I ask for someone to help me build my script? I feel like I am butting my head against a wall. I know I have much to learn, but can learn a lot from seeing the script and breaking it down to see how it functions.

    Thanks.

    Tarballed

      roughly...

      ...
      my %lookfor;
      # will look like this at the end
      # %lookfor = (
      #    192.168.254.1 => {
      #        first => '03/13/03 16:44:56',
      #        last  => '03/13/03 16:59:30',
      #        count => 12,
      #        bytes => 1234
      #    },
      #    192.168.254.2 => {
      #        ....
      #    }
      # );
      #
      while (<FWLOG>) {
          if (/(.*?) kernel Temporarily blocking host (.*)/) {
              $lookfor{$2}{first} = $1;
          }
          elsif (/(.*?) firewalld.*? deny in eth0 (\d+) (\w+) (\d+) (\d+) ([\d\.]+) \S+ (\d+) (\d+) (\w+) (.*)/) {
              # the uncaptured \S+ skips the destination IP (209.126.xxx.xxx in the sample)
              if (exists $lookfor{$6}) {
                  $lookfor{$6}{count} += 1;
                  $lookfor{$6}{last}   = $1;
                  $lookfor{$6}{bytes} += $7;  # or whichever field is bytes ($7 as written is the source port)
                  # keep track of other info of interest
              }
              # else ignore the uninteresting deny lines
          }
          # else ignore lines we don't care about at all
      }

      printf "%15s %6s %18s %18s %s\n", qw/ ip count first last bytes /;
      foreach my $badguy (keys %lookfor) {
          printf "%15s %6d %18s %18s %d\n",
              $badguy,
              $lookfor{$badguy}{count},
              $lookfor{$badguy}{first},
              $lookfor{$badguy}{last},
              $lookfor{$badguy}{bytes};
      }
Re: Extracting data from a firewall log
by CountZero (Bishop) on Apr 09, 2003 at 19:32 UTC

    Here is a snip from the firewall log, of what I am attempting to extract from the log:
    03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21
    03/13/03 16:44:57 firewalld[103] deny in eth0 48 tcp 20 117 212.241.116.21 209.126.xxx.xxx 4449 80 syn (LO-Proxied-HTTP)

    This must mean that you are able somehow to read the logs: let's assume that you did so by opening the wgl-log file in some sort of editor (and not in a proprietary viewer).

    Chances are good, then, that the log is in (a variant of) ASCII, and Perl will be able to read it by opening a filehandle for reading and pulling the log in line by line with the <INPUT-FILEHANDLE> operator.

    If all log-lines start with a date-and-timestamp, you can extract all data following these and put it through some regular expressions to weed out the useless entries and keep the valuable ones, which you can then either output to another file, save in a database, calculate some statistics from or --in general-- mangle beyond all recognition to your (and Perl's) heart's content.
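As a minimal sketch of that read-filter-write loop (not a finished script): the sub below reads from one filehandle, keeps only timestamped lines that look like temporary blocks or denies, and prints them to a second filehandle. The name filter_log and the two "keep" patterns are my own assumptions based on the sample lines in the question, and the third sample line is made up purely to show a line being dropped. The demo uses in-memory filehandles (Perl 5.8 or later) so it runs without any log file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy the interesting lines from one filehandle to another.
# Returns the number of lines kept. (filter_log is a hypothetical name.)
sub filter_log {
    my ($in, $out) = @_;
    my $kept = 0;
    while (my $line = <$in>) {
        # Skip anything that does not start with a stamp like "03/13/03 16:44:56 "
        next unless $line =~ m{^\d\d/\d\d/\d\d \d\d:\d\d:\d\d };
        # Keep temporary blocks and denied packets; drop everything else
        next unless $line =~ /Temporarily blocking host/
                 || $line =~ /firewalld\S* deny /;
        print {$out} $line;
        $kept++;
    }
    return $kept;
}

# Sample input: the two lines from the question plus one MADE-UP line
# that should be filtered out.
my $log = join '',
    "03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21\n",
    "03/13/03 16:44:57 firewalld[103] deny in eth0 48 tcp 20 117 "
        . "212.241.116.21 209.126.xxx.xxx 4449 80 syn (LO-Proxied-HTTP)\n",
    "this made-up line has no timestamp and is dropped\n";

my $result = '';
open my $in,  '<', \$log    or die "Cannot read sample: $!";
open my $out, '>', \$result or die "Cannot write result: $!";
my $kept = filter_log($in, $out);
close $in;
close $out;

print "kept $kept line(s):\n$result";
```

For the real thing you would open 'firewall.txt' for reading and some output file for writing instead of the two scalar references; the loop itself stays the same.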

    To parse the logfile, you might have a look at regexp-log, HTTPD-Log-Filter or Log-Detect. Even if you can't use these modules directly, they will certainly give you some good ideas on how to tackle your task!

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      Yes, that is correct. I have sent the logs over to my Linux server via syslogd. The files are now viewable with any text editor.

      Also, I am planning on searching through the file, then outputting the results to another file and saving them in a database. (My list just continues to grow.)

      I have to say, this task seems very daunting to me. If I could trouble someone, could you get me started a little? Maybe give me a few hints or examples.

      I think once I have a few things under my belt, I will feel more confident and know where to proceed from that point.

      Thanks.

      tarballed

Re: Extracting data from a firewall log
by Coruscate (Sexton) on Apr 10, 2003 at 01:09 UTC

    Just to link the 2 nodes together, the script resulting from the efforts of this node is posted at node id 249479, My first Perl script.


    If the above content is missing any vital points or you feel that any of the information is misleading, incorrect or irrelevant, please feel free to downvote the post. At the same time, please reply to this node or /msg me to inform me as to what is wrong with the post, so that I may update the node to the best of my ability.

Re: Extracting data from a firewall log
by jasonk (Parson) on Apr 09, 2003 at 17:53 UTC

    It may be possible, but the file type 'data' won't tell you that. 'data' just means 'unidentified binary content', so you need to find out what the format of that content is to determine if/how you can read it.
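One quick way to get a second opinion from Perl itself is the -T file test, which applies a heuristic to the start of a file and guesses whether it is text. The sketch below is only a first check, not a format detector, and it assumes nothing about the real .wgl layout: the sub name looks_like_text is made up, and the "binary" sample is just arbitrary bytes written to a temp file for the demo.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# True if the file exists, is non-empty, and Perl's -T heuristic thinks it is text.
# (looks_like_text is a hypothetical helper name.)
sub looks_like_text {
    my ($path) = @_;
    return (-f $path && -s $path && -T $path) ? 1 : 0;
}

# A text sample: one line in the format shown in the question.
my ($text_fh, $text_path) = tempfile(UNLINK => 1);
print {$text_fh} "03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21\n";
close $text_fh;

# A binary sample: arbitrary bytes, NOT real .wgl content.
my ($bin_fh, $bin_path) = tempfile(UNLINK => 1);
binmode $bin_fh;
print {$bin_fh} "\x00\x01\x02\xff" x 16;
close $bin_fh;

for my $sample (['text sample', $text_path], ['binary sample', $bin_path]) {
    my ($label, $path) = @$sample;
    printf "%s: %s\n", $label,
        looks_like_text($path) ? 'probably text' : 'probably binary';
}
```

If the real log comes back as "probably binary", a regex-based parser will not get far until the format has been worked out or the log has been exported as text.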


    We're not surrounded, we're in a target-rich environment!
      I see. So basically, I need to identify the format of the file and once I have that, I can move forward?
      Thanks. Tarballed
Re: Extracting data from a firewall log
by tarballed (Initiate) on Apr 09, 2003 at 19:25 UTC
    Alright, I have figured out the log file type. I set up the Windows server to send the log files over via syslogd. Now the files will be in a viewable text format. That part is solved.

    Now I need to work on the script. (And what a script it will be, being it will be my first true script.)

    In answer to your questions:

    Yes, I would like to extract the data like this one:
    03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21

    03/13/03 16:44:57 firewalld[103] deny in eth0 48 tcp 20 117 212.241.116.21 209.126.xxx.xxx 4449 80 syn (LO-Proxied-HTTP)

    Right now, deny, allow and log are the actions that show up in the log. For interfaces, we have eth0 and eth1. Most packets will be coming in through eth0 (trying to, at least).

    Other parts of the log are basically bits and pieces of information about the packet: size, source and destination port. What I want to pull out is basically the IP address that was attempting access (in this case, 212.241.116.21). I get a lot of these during the day.

    I was thinking that I could do a search somehow, looking for "Temporarily blocking host", then grab that information and the IP as well. Grabbing the destination port would be nice as well. That is the very last number before the word syn.

    That help?

    Tarballed

      To get the destination port out of a line which also contains "firewall" and "deny", you could use the following regular expression:

      m/firewall.+deny.+ (\d+) syn/
      Translated it means: match "firewall" followed by some characters, followed by "deny" followed by some more characters, followed by a space, some digits, another space and "syn"; also save the digits you found in the special variable $1.
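To see that expression at work on the sample line, here is a small sketch. dest_port wraps the regular expression exactly as given above; blocked_host is a companion I have added for the "Temporarily blocking host" lines, and both sub names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Destination port from a firewalld deny line, using the regex given above.
sub dest_port {
    my ($line) = @_;
    return $line =~ m/firewall.+deny.+ (\d+) syn/ ? $1 : undef;
}

# IP address from a "Temporarily blocking host" line (my own companion regex).
sub blocked_host {
    my ($line) = @_;
    return $line =~ m/Temporarily blocking host (\d+\.\d+\.\d+\.\d+)/ ? $1 : undef;
}

my $deny_line  = '03/13/03 16:44:57 firewalld[103] deny in eth0 48 tcp 20 117 '
               . '212.241.116.21 209.126.xxx.xxx 4449 80 syn (LO-Proxied-HTTP)';
my $block_line = '03/13/03 16:44:56 kernel Temporarily blocking host 212.241.116.21';

print "destination port: ", dest_port($deny_line),     "\n";   # prints 80
print "blocked host:     ", blocked_host($block_line), "\n";   # prints 212.241.116.21
```

Note that the port regex only matches lines that end in "syn"; deny lines for other kinds of packets would need their own pattern.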

      CountZero

      "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Extracting data from a firewall log
by tarballed (Initiate) on Apr 09, 2003 at 20:25 UTC
    Thanks again for your help. I am reading those links right now. I basically understand how to open and close the file. What I am trying to figure out at this point is how to extract the data from the file and then send that data to a new file.

    Here is my little script so far:

    #!/usr/bin/perl
    # This script will be run against our firewall logs. It will extract the information we want and send it to a new log.
    # Open the firewall log to allow Perl to read it and gather data
    open(FWLOG, "firewall.txt") or die "Cannot open firewall.txt: $!";
    while (<FWLOG>) {
        # (extraction of the interesting lines still to be written)
    }
    # Close the file
    close FWLOG;
    It is not much, but I am trying to grasp the basics and understand it all.

    Tarballed

Re: Extracting data from a firewall log
by tarballed (Initiate) on Apr 09, 2003 at 17:54 UTC
    Hmm, I thought I was logged in, but I am not. But, just wanted to say this is Tarballed. :)