Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello

How can I read a txt file with IP-addresses, grep these IP-addresses from several logfiles, and print the result to a file?

I know that the IP-addresses in IP.TXT exist in the logfiles.
But I want to know which logfile contains each of these IP-addresses.

I have a file with several IP-addresses like this.
IP.TXT
192.168.1.1
10.10.10.10

and so on

Then I have several logfiles that contain these IP-addresses like this.
FILE1_LOG
1.1.1.1
192.168.1.1
10.10.10.10

FILE2_LOG
2.2.2.2
10.10.11.11
10.10.10.10

and so on

The result I want is like this.
Resultfile.txt
IP 192.168.1.1 is in file FILE1_LOG.
IP 10.10.10.10 is in file FILE2_LOG.

The code so far
(I can read the lines from the logfiles, but I don't know how to read in IP.TXT and grep for its contents).
    use strict;
    my $logdir=".";
    opendir(LOGS, $logdir) or die ("Cant open $logdir");
    my $pattern = '_log';
    my @logfiles = grep /$pattern/,(readdir LOGS);
    closedir(LOGS);
    foreach my $file (sort @logfiles) {
        my $filename = $logdir . "/" . $file;
        open(LOG, $filename);
        my @line = <LOG>;
        if (my @hit = grep /I WANT TO GREP FROM IP.TXT/, @line) {
            chomp(@hit);
            print ("IP @hit is in file $file\n");
        }
    }

Or can I do it another way?

//Anders Andersson

janitored by ybiC: <code> tags

Replies are listed 'Best First'.
Re: How can I grep these IP-addresses from these logfiles?
by atcroft (Abbot) on Oct 04, 2003 at 20:20 UTC

    You seemed very close to having it with what you displayed. You could join the chomp()ed terms from ip.txt using the pipe ('|') character, which in the expression acts as an "or" (iirc), thus making it look for any of the options inside. Adding '^(' and ')$' to it appears to prevent it matching an accidental substring (such as '10.10.10.101', when matching for '10.10.10.10').

    Below is the code I threw together that appears to meet the criteria you specified, but does not involve slurping the logfile for searching (which could be an issue with a large logfile) nor the use of grep to perform the actual search. Also, addresses are sorted in IP order, and the resulting report can be made more verbose, giving the number of occurrences in a particular file, by setting $verbose non-zero. It is perhaps not the best of code, but at least gives you a starting point to look at.

        #!/usr/bin/perl -w
        use strict;

        # Turn on short or detailed output
        my $verbose = 0;
        my $logdir  = '.';
        my $pattern = '_log';

        # Get files to process (from original posted code)
        opendir( LOGS, $logdir ) or die ("Can't open $logdir: $!\n");
        my @logfiles = grep( /$pattern/, readdir(LOGS) );
        closedir(LOGS);

        # Slurp in addresses and assemble pattern: ^(aaa|bbb|ccc)$
        my ($ippattern);
        my $ipfilelist = join ( '/', $logdir, 'ip.txt' );
        open( IPLIST, $ipfilelist ) or die ("Can't open $ipfilelist: $!\n");
        {
            my @iplist = <IPLIST>;
            chomp(@iplist);
            $ippattern = '^(' . join ( '|', @iplist ) . ')$';
        }
        close(IPLIST);

        # Process files, incrementing results in $found{ip}{filename}
        my (%found);
        foreach my $file ( sort(@logfiles) ) {
            my $filename = join ( '/', $logdir, $file );
            open( LOG, $filename ) or die ("Can't open $filename: $!\n");
            while (<LOG>) {
                chomp;
                $found{$_}{$filename}++ if (m/$ippattern/);
            }
            close(LOG);
        }

        # Display results, looping through first in ip order,
        # then filenames in ASCII order
        foreach my $j (
            sort( {
                    unpack( "N", pack( "C4", split ( /\D/, $a, 4 ) ) )
                      <=> unpack( "N", pack( "C4", split ( /\D/, $b, 4 ) ) )
                } keys(%found) )
          )
        {
            foreach my $k ( sort( keys( %{ $found{$j} } ) ) ) {
                print $j;
                if ($verbose) {
                    print ' appeared ', $found{$j}{$k}, ' ',
                      ( $found{$j}{$k} > 1 ? 'times' : 'time' ),
                      ' in file ';
                }
                else {
                    print ' is in file ';
                }
                print $k, "\n";
            }
        }
        __END__
        #
        # Contents of ip.txt, for testing
        192.168.1.1
        10.10.10.10
        4.3.2.1
        #
        # Contents of file1_log, for testing
        1.1.1.1
        4.3.2.1
        10.20.30.40
        192.168.1.1
        10.10.10.10
        #
        # Contents of file2_log, for testing
        2.2.2.2
        3.3.3.3
        10.10.11.11
        10.10.10.10
        10.10.10.10
        10.10.10.10
        10.10.10.10
        10.10.10.10
        10.10.10.10
        10.10.10.100
        #
        # Sample of output, with $verbose = 0
        4.3.2.1 is in file ./file1_log
        10.10.10.10 is in file ./file1_log
        10.10.10.10 is in file ./file2_log
        192.168.1.1 is in file ./file1_log
        #
        # Sample of output, with $verbose = 1
        4.3.2.1 appeared 1 time in file ./file1_log
        10.10.10.10 appeared 1 time in file ./file1_log
        10.10.10.10 appeared 6 times in file ./file2_log
        192.168.1.1 appeared 1 time in file ./file1_log

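
    The anchored alternation described in this reply can also be tried in isolation. The following is only a minimal sketch, not part of the posted program: the arrays @iplist and @loglines are made-up sample data, and the call to quotemeta is an addition, included on the assumption that each '.' in an address should match a literal dot rather than any character.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of ip.txt.
my @iplist = ( '192.168.1.1', '10.10.10.10' );

# quotemeta escapes each '.', so it matches a literal dot only.
my $alternation = join '|', map { quotemeta($_) } @iplist;

# Anchor both ends so that only a whole line can match.
my $ippattern = qr/^(?:$alternation)$/;

# Hypothetical log lines: two real hits, one longer address,
# and one look-alike that an unescaped '.' would have accepted.
my @loglines = ( '192.168.1.1', '10.10.10.101', '10x10x10x10', '10.10.10.10' );

my @hits = grep { $_ =~ $ippattern } @loglines;
print "$_\n" for @hits;
```

    Without the anchors, '10.10.10.101' would be reported as a hit for '10.10.10.10'; without quotemeta, so would '10x10x10x10'.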
Re: How can I grep these IP-addresses from these logfiles?
by tcf22 (Priest) on Oct 04, 2003 at 20:20 UTC
        #! /usr/bin/perl
        use strict;

        #BUILD IP LIST
        open(IPS, 'ip.txt') || die "Can't open IP.txt: $!\n";
        my $iplist = join('|', map {chomp;quotemeta} <IPS>);
        close(IPS);
        my $re = qr/$iplist/;

        #READ IN LOG FILE
        my @log = map {chomp;$_} <DATA>;

        #Check for matches
        my @matches = grep /$re/, @log;
        print "$_\n" foreach(@matches);

        __DATA__
        1.1.2.1
        1.3.4.5
        192.168.0.1
        1.2.3.4
        4.5.6.7
        10.0.0.1

    ip.txt:
        192.168.0.1
        1.1.1.1
        10.0.0.1

    Output:
        192.168.0.1
        10.0.0.1

    Update: Added the contents of ip.txt, so that the output makes more sense.
    Combined 2 map statements into 1 when reading in ip.txt.

    - Tom

Re: How can I grep these IP-addresses from these logfiles?
by davido (Cardinal) on Oct 04, 2003 at 20:43 UTC
    I want to illustrate an iterative solution that uses map and grep. Admittedly this method puts a loop inside of a loop (rather than just one loop). Here goes.....

        my @lines = qw/marge homer bart jerry maggie george elaine kramer
                       lisaross chandler fibi joey monica rachel batman robin
                       catwoman batgirl superman thetick hollywood studio
                       universal compton harbor downtown monica venice oc
                       oxnard ventura oaks/;
        my @kwords = qw/bart maggie monica pete/;
        my @matches = map { my $kword=$_; grep /\b$kword\b/, @lines } @kwords;
        {
            local $, = "\t";
            print @matches;
        }

    This is just an example. A little tinkering will match it to your needs. Enjoy!

    Update: I tested the above method against the regexp method in the first followup in this thread. The regexp (alternation) method proved to be faster for this small test-set of data. Even when I grew the keyword list to about 40 words, and the array of strings to find keywords in to about 40 lines of 10 "words" each, I still found alternation to be faster. I don't know what to expect if the keyword list gets REALLY big. Obviously my method puts a loop inside a loop. But alternation itself isn't considered terribly fast. Someone with a lot of test data might be able to shed light.
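
    For a really big keyword list, a third option may be worth trying when every log line is exactly one address, as in the original question: an exact hash lookup. This is a different technique from both the nested grep and the alternation, and it was not part of the comparison described in the update. A minimal sketch, with @wanted and @loglines as made-up sample data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the lines of IP.TXT and of one logfile.
my @wanted   = ( '192.168.1.1', '10.10.10.10' );
my @loglines = ( '2.2.2.2', '10.10.11.11', '10.10.10.10', '10.10.10.101' );

# Build a lookup table: every wanted address becomes a hash key.
my %is_wanted = map { $_ => 1 } @wanted;

# One pass over the log, one hash lookup per line.
# Only exact, whole-line matches count, so 10.10.10.101 is skipped.
my @found = grep { $is_wanted{$_} } @loglines;

print "IP $_ is in the log\n" for @found;
```

    The lookup only works for whole-line matches. If an address can appear in the middle of a longer log line, a regexp is still needed.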


    Dave


    "If I had my life to do over again, I'd be a plumber." -- Albert Einstein
Re: How can I grep these IP-addresses from these logfiles?
by chanio (Priest) on Oct 04, 2003 at 21:07 UTC
    Thank you for the example, it's good!

    You only have to copy this line:

    my $IPs="(1.1.1.1|192.168.1.1|10.10.10.10)"; #notice the '|' as a different option to compare

    And replace I WANT TO GREP FROM IP.TXT with $IPs.

    And the last $file should be $filename.

    Please, try it!

    Listen, if you are learning all this but you don't understand it, you are losing your precious time.

    You should understand all of this so well that you could write it in several different ways (if you want to make something useful out of this learning, for your own sake and not because of anybody else).

    If you need to read the IPs from a file, just imagine that it is another text file, and read it the same as any log file. Then keep the IP values.
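
    A minimal sketch of that idea follows. To keep it runnable on its own, it reads from an in-memory string ($sample, an assumption made for this example) instead of the real IP.TXT. The quotemeta call is also an addition, so that each '.' only matches a literal dot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: this string stands in for the contents of IP.TXT.
# With the real file, open 'IP.TXT' instead of \$sample.
my $sample = "192.168.1.1\n10.10.10.10\n";

open( my $ipfh, '<', \$sample ) or die "Can't open IP list: $!";
my @ips = <$ipfh>;    # read every line, exactly as with a logfile
close($ipfh);
chomp(@ips);          # keep the values without their newlines

# Build the '|' pattern from the values that were read.
my $IPs = '(' . join( '|', map { quotemeta($_) } @ips ) . ')';

print "$IPs\n";
```

    The resulting $IPs can then be used in the grep of the original code in place of the hand-written line above.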

    Use this practice to create a game, or something that you need and don't want to spend a lot of time writing. Something funny, or original.