PerlMonks  

Another Array Problem: comparing.

by dru145 (Friar)
on Dec 19, 2001 at 01:36 UTC

dru145 has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

Ok, I'm stuck. I have a log file in this format:
1;30Nov2001;17:08:25;192.168.1.2;log;accept;;hme0;outbound;udp;192.168.86.6;20.248.36.99;domain-udp;1103;63;85;;;;;;;;;;;;;;;
2;30Nov2001;17:08:25;192.168.1.2;log;drop;;hme0;inbound;tcp;63.28.96.254;192.168.11.67;netbios-ssn;18803;48;89;;;;;;;;;;;;;;;
3;30Nov2001;17:08:26;192.168.1.2;log;drop;;hme0;inbound;tcp;65.93.20.223;192.168.26.139;auth;1323;60;89;;;;;;;;;;;;;;;
4;30Nov2001;17:08:26;192.168.1.2;log;drop;;hme0;inbound;tcp;65.93.22.223;192.168.26.139;auth;1323;60;89;;;;;;;;;;;;;;;
5;30Nov2001;17:08:26;192.168.1.2;log;accept;;qfe2;inbound;tcp;192.168.86.146;20.248.36.97;http;4719;44;85;;;;;;;;;;;;;;;
6;30Nov2001;17:08:26;192.168.1.2;log;accept;;hme0;outbound;tcp;192.168.86.146;204.48.36.97;http;4719;44;85;;;;;;;;;;;;;;;
7;30Nov2001;17:08:26;192.168.1.2;log;accept;;qfe2;inbound;tcp;192.168.86.146;204.48.36.97;http;4721;44;85;;;;;;;;;;;;;;;
8;30Nov2001;17:08:26;192.168.1.2;log;accept;;hme0;outbound;tcp;192.168.86.146;24.248.36.97;http;4721;44;85;;;;;;;;;;;;;;;
8;30Nov2001;17:08:26;192.168.1.2;log;accept;;hme0;outbound;tcp;192.168.86.146;20.248.36.97;http;4721;44;85;;;;;;;;;;;;;;;
9;30Nov2001;17:08:26;192.168.1.2;log;accept;;qfe2;inbound;tcp;192.168.27.154;205.18.145.185;http;4396;44;85;;;;;;;;;;;;;;;
10;30Nov2001;17:08:26;192.168.1.2;log;accept;;hme0;outbound;tcp;192.168.27.154;25.188.145.185;http;4396;44;85;;;;;;;;;;;;;;;
11;30Nov2001;17:08:26;192.168.1.2;log;accept;;qfe2;inbound;tcp;192.168.27.154;205.88.145.185;http;4397;44;85;;;;;;;;;;;;;;;
12;30Nov2001;17:08:26;192.168.1.2;log;accept;;hme0;outbound;tcp;192.168.27.154;205.188.45.185;http;4397;44;85;;;;;;;;;;;;;;;
And here is the code I have written so far:
#!/usr/bin/perl -w
use strict;

my $log = './log';
my @data;

# Open the firewall log file and create new array containing all of the data.
open (LOG, $log) or die "Can't open $log: $!";
while (<LOG>){
    push (@data, "$_");
}

# Split the @data array into separate arrays by category.
my (@dst, @service);
foreach (@data) {
    my @lines=split "\n",$_;
    foreach(@lines){
        my ($num,$date,$time,$fw,$type,$action,$alert,$int,$dir,$proto,$src,$dst,$service,$sport,$len,$rule) = (split /;/,$_);
        push(@dst, $dst);
        push(@service, $service);
    }
}
This seems to work fine, but what I need to do now is compare the @dst and @service arrays, and if the @dst array has the same IP AND the @service array has the same service for at least 50 log entries, then I want to execute a sub. I can't think of how to do this.

Any suggestions?

TIA

-Dru

Edit kudra, 2001-12-22 Appended to title

Replies are listed 'Best First'.
Re: Another Array Problem.
by TomK32 (Monk) on Dec 19, 2001 at 01:57 UTC
    It's not a clean solution, but it works:

    replace

    push(@dst, $dst);
    push(@service, $service);
    with
    $data{$dst}{$service}++;    # needs "my %data;" declared before the loop
    if ($data{$dst}{$service} == 50) {
        do_magic();             # placeholder: call your own sub here
    }
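A minimal sketch of how this counter could sit inside the parsing loop. The names `count_pairs`, `do_magic`, `$THRESHOLD`, and the in-memory sample lines are assumptions made for illustration; they do not come from the original post. The hash is keyed first by destination and then by service, so only entries that share both values increment the same counter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical threshold; the question asks for "at least 50 log entries".
my $THRESHOLD = 50;

# Hypothetical stand-in for whatever should happen at the threshold.
# Here it only records which pair triggered it.
my @triggered;
sub do_magic {
    my ($dst, $service) = @_;
    push @triggered, "$dst/$service";
}

# Count each (destination, service) pair and call do_magic() once,
# at the moment a pair's count reaches $threshold.
# Returns a reference to the nested count hash.
sub count_pairs {
    my ($threshold, @lines) = @_;
    my %seen;
    for my $line (@lines) {
        my ($dst, $service) = (split /;/, $line)[11, 12];
        next unless defined $dst and defined $service;   # skip short lines
        $seen{$dst}{$service}++;
        do_magic($dst, $service) if $seen{$dst}{$service} == $threshold;
    }
    return \%seen;
}

# Example input: one log entry repeated 50 times, plus one different entry.
my $hit  = '5;30Nov2001;17:08:26;192.168.1.2;log;accept;;qfe2;inbound;tcp;192.168.86.146;20.248.36.97;http;4719;44;85';
my $miss = '3;30Nov2001;17:08:26;192.168.1.2;log;drop;;hme0;inbound;tcp;65.93.20.223;192.168.26.139;auth;1323;60;89';
my $counts = count_pairs($THRESHOLD, ($hit) x 50, $miss);
print "triggered: @triggered\n";
```

Testing with `==` rather than `>=` means the sub runs once per pair instead of again for entries 51, 52, and so on.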

    -- package Lizard::King; sub can { do { 'anything'} };
(Ovid) Re: Another Array Problem.
by Ovid (Cardinal) on Dec 19, 2001 at 02:01 UTC

    This isn't tested, but I cleaned up your code and added a couple of things to do what I think you need:

    #!/usr/bin/perl -w
    use strict;

    my $log = './log';

    # Open the firewall log file and create new array containing all of the data.
    open LOG, "<", $log or die "Can't open $log: $!";
    my @data = <LOG>;
    close LOG;

    # Split the @data array into separate arrays by category.
    my (@dst, @service);
    my $dup_count = 0;
    my $last_dst = '';
    my $last_service = '';

    foreach (@data) {
        my @lines=split "\n",$_;
        foreach(@lines){
            my ($dst,$service) = (split /;/,$_)[11,12];
            push(@dst, $dst);
            push(@service, $service);
            if ( $dst eq $last_dst and $service eq $last_service ) {
                $dup_count++;
                # you probably want to clean up dup_count here to avoid func
                # being called for dup 51, dup 52, etc
                &some_func if $dup_count >= 50;
            }
            else {
                # didn't match, so we reset;
                $dup_count = 0;
            }
            $last_dst = $dst;
            $last_service = $service;
        }
    }

    Cheers,
    Ovid

    Update: Okay, I think I am confused about the specification. I thought we were looking for 50 identical dsts and services in a row. Reading the question more closely, it appears to be a hash-related issue, in which case TomK32 gave a good answer.
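The two readings of the question mentioned in this update give different numbers for the same data. The sketch below contrasts them; the sub names and the made-up "dst;service" strings are assumptions for illustration only.

```perl
use strict;
use warnings;

# Longest run of identical adjacent items (the "in a row" reading).
sub longest_run {
    my ($best, $run, $prev) = (0, 0, undef);
    for my $item (@_) {
        $run  = (defined $prev and $item eq $prev) ? $run + 1 : 1;
        $best = $run if $run > $best;
        $prev = $item;
    }
    return $best;
}

# Highest total count of any one item, adjacent or not (the hash reading).
sub highest_total {
    my %count;
    my $best = 0;
    for my $item (@_) {
        $count{$item}++;
        $best = $count{$item} if $count{$item} > $best;
    }
    return $best;
}

# Hypothetical "dst;service" keys: the same pair appears three times,
# but never more than twice in a row.
my @pairs = ('10.0.0.1;http', '10.0.0.1;http', '10.0.0.2;auth', '10.0.0.1;http');
printf "longest run: %d, highest total: %d\n",
    longest_run(@pairs), highest_total(@pairs);
```

With interleaved firewall traffic, the total count is usually the larger of the two, so which reading is intended changes when the sub would fire.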

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Re: Another Array Problem.
by aijin (Monk) on Dec 19, 2001 at 02:03 UTC
    There are a couple of things that aren't clear to me here.

    First, why read the file into an array and then go through the array and split the lines? Are you sure any of the lines are splitting? When you use while(<LOG>), you are reading the file line by line, which makes the split you've got in the foreach loop unnecessary.

    Secondly, it's not clear what exactly you're checking, so I'm going to make the following assumptions. Please correct me if I'm wrong.

    1. You want to check if all the entries in the @dst array are the same IP.

    2. You want to know when there are 50+ of any service in the @service array.

    Both of these tasks can be solved with the use of a hash. I suggest you meander over to the Categorized Questions and Answers section and read up on them.
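A small sketch of both checks under the two assumptions listed above. The sub names `all_same` and `frequent` and the sample arrays are hypothetical; in a real run the arrays would be filled from the log.

```perl
use strict;
use warnings;

# Return true if every element of the list is the same string.
# Using the values as hash keys collapses duplicates, so exactly
# one key means exactly one distinct value.
sub all_same {
    my %distinct = map { $_ => 1 } @_;
    return keys(%distinct) == 1;
}

# Return, in sorted order, the items that occur at least $min times.
sub frequent {
    my ($min, @items) = @_;
    my %count;
    $count{$_}++ for @items;
    my @hits = grep { $count{$_} >= $min } keys %count;
    return sort @hits;
}

# Made-up sample data standing in for @dst and @service.
my @dst     = ('192.168.26.139') x 3;
my @service = (('http') x 50, ('auth') x 2);

print "all destinations identical\n" if all_same(@dst);
print "frequent services: ", join(', ', frequent(50, @service)), "\n";
```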

    -a.

      aijin,

      You are correct, I didn't need the first array (@data), nor did I need to split on newlines. Thanks for the hash tip. I figured that's what I needed, but I don't have much experience with hashes, so I took this time to learn. I think I'm finally grasping them. Here is the code I came up with (suggestions appreciated):
      #!/usr/bin/perl -w
      use strict;

      my $log = './log';
      my (%count, %hash);

      open (LOG, $log) or die "Can't open $log: $!";
      while (<LOG>){
          foreach($_){
              my ($num,$date,$time,$fw,$type,$action,$alert,$int,$dir,$proto,$src,$dst,$service,$sport,$len,$rule) = (split /;/,$_);
              %hash = (dest => $dst, service => $service);
              foreach my $key (keys %hash){
                  my $val = $hash{$key};
                  $count{$val}++;
              } #close foreach
          } #close foreach
      }#close while

      foreach my $key1 (keys %count){
          print "$key1 appears $count{$key1} times\n";
      } #close foreach
      I still need something that will run a sub if both the destination IP AND service appear AT LEAST 50 times in the log files, but I think this will be fairly easy.

      Thanks again,
      Dru
        Suggestions welcome? Here they come :)

        foreach($_){
        foreach($_) is kind of useless, you can safely remove it (and its closing bracket, of course).

        my ($num,$date,$time,$fw,$type,$action,$alert,$int,$dir,$proto,$src,$dst,$service,$sport,$len,$rule) = (split /;/,$_);
        You don't have to name everything. Instead, you can assign to undef if you don't need a specific value.
        my (undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, $dst, $service, undef, undef, undef) = split /;/;
        # split() works on $_ if only one argument is given.
        Because there are more undefs than used values, a list slice would be even better:
        my ($dst, $service) = (split /;/)[11, 12];
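To see what the slice picks out, here is a short sketch that runs it against one line copied from the sample log in the question. The variable names `$line`, `$dst2`, and `$service2` are only for this illustration.

```perl
use strict;
use warnings;

# One line copied from the sample log in the question.
my $line = '2;30Nov2001;17:08:25;192.168.1.2;log;drop;;hme0;inbound;tcp;63.28.96.254;192.168.11.67;netbios-ssn;18803;48;89;;;;;;;;;;;;;;;';

# Explicit form: split a named string, then slice fields 11 and 12
# (counting from 0) out of the resulting list.
my ($dst, $service) = (split /;/, $line)[11, 12];

# Implicit form, as in the one-liner above: split defaults to $_.
my ($dst2, $service2);
for ($line) {
    ($dst2, $service2) = (split /;/)[11, 12];
}

print "$dst $service\n";   # prints "192.168.11.67 netbios-ssn"
```

The parentheses around the split call are what make this a list slice; without them Perl would try to subscript the pattern's argument instead.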

        %hash = (dest => $dst, service => $service);
        foreach my $key (keys %hash){
            my $val = $hash{$key};
            $count{$val}++;
        } #close foreach
        } #close while
        There's no need to use these temporary variables %hash and $val;
        Well indented code doesn't need "#close foreach" comments (unless it's a huge sub, but in that case the design was probably wrong anyway).
        Because only the values of the hash are used and they're set within the same scope, there's no need for the hash at all.
        I'll also use the for-modifier (for equals foreach, but is shorter) to demonstrate perl's nice syntactic features.
        $count{$_}++ for $dst, $service;
        }

        foreach my $key1 (keys %count){
            print "$key1 appears $count{$key1} times\n";
        } #close foreach
        This can be done using map, but it might be confusing if you don't know how it works:
        print map "$_ appears $count{$_} times\n", keys %count;

        Please also note that I put a space after every comma, which in my opinion makes the source more readable.
        I hope this was useful to you.

        As a whole:

        #!/usr/bin/perl -w
        use strict;

        my $log = './log';
        my %count;

        open (LOG, $log) or die "Can't open $log: $!";
        while (<LOG>){
            my ($dst, $service) = (split /;/)[11, 12];
            $count{$_}++ for $dst, $service;
            # Now I see it this way, I realise that
            # $count{$_}++ for (split /;/)[11, 12];
            # would be even better :)
        }

        print map "$_ appears $count{$_} times\n", keys %count;

        2;0 juerd@ouranos:~$ perl -e'undef christmas' Segmentation fault 2;139 juerd@ouranos:~$
