Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys, this is a follow-up question to a previous thread: Re: Grep logs by start date and end date in different directories. thanos has provided me with a very useful basic script, for which I am very thankful, but I still have some doubts. This is the current code (which I have modified from thanos's script):
use strict;
use warnings;
use Date::Manip;
use Data::Dumper;
use File::Find::Rule;
use IO::Uncompress::Bunzip2 ();
use Net::Subnet;

my @LogDir  = </cygdrive/c/Users/anon/Documents/logs/png1/>;
my @LogDir2 = </cygdrive/c/Users/anon/Documents/logs/png2/*/>;

#Find relevant files
sub get_files {
    my (@dirs) = @_;
    @ARGV = (@LogDir, @LogDir2);
    my @dirss = join "", map {@ARGV . $_} @dirs;
    my $level = shift // 3; # level to dig into
    my @files = File::Find::Rule->file()
                                ->name( '*.bz2','*.log' ) #can insert regex too
                                ->maxdepth($level)
                                ->in(@dirss);
    #print @files;
    return @files;
}

#Matches IP address only
sub searchForIP {
    my ($files, $ip) = @_;
    my @files = @$files;
    for my $file (@files){
        my $filename = $file;
        my $fh = IO::Uncompress::Bunzip2->new($filename)
            or die "bunzip2 $filename: $IO::Uncompress::Bunzip2::Bunzip2Error";
        while (<$fh>){
            print "$filename:$.:$_" if /$ip/;
        }
    }
}

#This portion contains some code for user input, I will leave this out cause its not related to my problem

my $numberOfDays = $numdays .' days';
my $dateStart = ParseDate("$sdate");
my $dateEnd = DateCalc($dateStart, $numberOfDays);

# To find the every day date1 to date2
my @dates =ParseRecur("0:0:0:1:0:0:0","",$dateStart, $dateEnd);
my @datesFormatted = map { UnixDate($_, '%Y-%m-%d') } @dates;

my @filess = get_files(@datesFormatted);
searchForIP(\@filess, $ip);
As you can see, I have multiple directories that contain many log files. As I am running the script from another directory, I have to join @LogDir with @dirs in order to get the full path. An example of @dirs is 2017-12-01/access.log.bz2, so when joined they become /cygdrive/c/Users/anon/Documents/logs/png1/2017-12-01/access.log.bz2. I have tried using only one directory containing the log files and it worked for me; however, it doesn't work when I try putting in more than one directory. I am unsure how to make it work with more than one directory. Below is the code I tried with only one directory (which worked for me):
#!/usr/bin/perl
use strict;
use warnings;
use Date::Manip;
use Data::Dumper;
use File::Find::Rule;
use IO::Uncompress::Bunzip2 ();
use Net::Subnet;

my $LogDir = "/opt/splunk/httpd/png1/";

#Find relevant files
sub get_files {
    my (@dirs) = @_;
    my @dirss = join "", map {$LogDir . $_} @dirs;
    my $level = shift // 3; # level to dig into
    my @files = File::Find::Rule->file()
                                ->name( '*.bz2' ) #can insert regex too
                                ->maxdepth($level)
                                ->in(@dirss);
    #print @files;
    return @files;
}
Any response will be greatly appreciated as I have already tried many ways to insert multiple directories but it still doesn't work... I am not sure why. Thank you again

Replies are listed 'Best First'.
Re: Process multiple directories
by thanos1983 (Parson) on Jan 08, 2018 at 10:31 UTC

    Hello Anonymous Monk,

    I applied minor modifications to your code, sample below:

    Since we do not have your input data or samples of your directories, we can only guess. For example, which dates are you checking, and is the format of the directories (dates) "2018-01-01" or something different? If we have to keep making assumptions, we will not be able to assist you quickly.

    Having said that, in the code above I modified the dates. I assume your code was failing to parse the dates because they were in the wrong format. Always compare the input the module expects with the input you are giving it. I simplified the user input to the "01/01/2018" format (d/m/y). Based on this modification I assume it will match your directories.
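To illustrate the date-format point: a minimal sketch that converts d/m/y user input into the YYYY-MM-DD form the dated directories appear to use. It uses the core Time::Piece module instead of Date::Manip purely as an assumption for brevity, and `to_dir_date` is a hypothetical helper name, not something from the scripts in this thread.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: turn d/m/y input (e.g. "03/01/2018") into the
# YYYY-MM-DD string used for the dated log directories.
sub to_dir_date {
    my ($input) = @_;
    # strptime dies if the input does not match the given format
    my $t = Time::Piece->strptime($input, '%d/%m/%Y');
    return $t->strftime('%Y-%m-%d');
}

print to_dir_date('03/01/2018'), "\n";   # day 3, month 1 -> 2018-01-03
```

Stating the format explicitly avoids any ambiguity between d/m/y and m/d/y, which matters for dates such as 03/01/2018.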

    Also observe that, based on your sample code, your @LogDir and @LogDir2 are strings, so why do you need to use arrays? I have modified the code above to use a sample of different directory paths that you could use.

    I hope this sample code resolves all your problems; otherwise, update your question with more data so we can understand where the problem is coming from.

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
      Hi again my saviour, that has completely solved my problem. Thank you so much! :D

        Hello Anonymous Monk,

        I am glad it worked. Keep up the good work and experiment with your code; in the end you will become very good.

        BR / Thanos

      Hi thanos1983, I am also quite interested in this topic. I am new to programming, so my questions may be quite basic. May I ask what happens if the user wants to enter multiple IP addresses instead of only one? How would you tackle that?

        Hello Anonymous Monk,

        The fastest and smallest modification is simply to join the IPs with a pipe character (|), which acts as alternation in the regex. For example: my $ip = '127.0.0.1|127.0.0.2'; you can easily add as many IPs as you want following the same pattern. There are many ways to achieve what you want, but this is the easiest and needs the fewest modifications to the code above. See the output below, based on testing with my local configuration. If you need further assistance let us know.

        Output using the code above:

        $ perl test.pl
        /home/tinyos/Monks/TestDir/2018-01-03/sys.log:1:127.0.0.1 This is insident 1 in 2018-01-05
        /home/tinyos/Monks/TestDir/2018-01-03/sys.log:2:127.0.0.2 This is insident 2 in 2018-01-05
        /home/tinyos/Monks/TestDir/2018-01-03/sys.log:4:127.0.0.1 This is second insident 4 in 2018-01-05
        /home/tinyos/Monks/TestDir2/2018-01-03/sys.log:1:127.0.0.1 This is insident 1 in 2018-01-05
        /home/tinyos/Monks/TestDir2/2018-01-03/sys.log:2:127.0.0.2 This is insident 2 in 2018-01-05
        /home/tinyos/Monks/TestDir2/2018-01-03/sys.log:4:127.0.0.1 This is second insident 4 in 2018-01-05
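One caveat with a plain '127.0.0.1|127.0.0.2' pattern is that an unescaped dot matches any character, and 127.0.0.1 also matches inside 127.0.0.10. A minimal sketch of a stricter variant follows; `build_ip_regex` is a hypothetical helper name, and the digit lookarounds are one possible choice rather than the only one.

```perl
use strict;
use warnings;

# Hypothetical helper: build one compiled regex matching any of the given IPs.
sub build_ip_regex {
    my (@ips) = @_;
    # quotemeta escapes the dots so they only match a literal "."
    my $alternation = join '|', map { quotemeta } @ips;
    # The lookarounds reject a match that is part of a longer number,
    # e.g. 127.0.0.1 inside 127.0.0.10 or 1127.0.0.1
    return qr/(?<!\d)(?:$alternation)(?!\d)/;
}

my $re = build_ip_regex('127.0.0.1', '127.0.0.2');
for my $line ('127.0.0.1 first', '127.0.0.10 other host', '127x0x0x2 not an IP') {
    print "match: $line\n" if $line =~ $re;
}
```

Only the first sample line matches. The resulting $re can be passed to searchForIP in place of $ip, and a comma-separated input string can be turned into the list with split /\s*,\s*/.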

        Hope this helps, BR.

      Hi thanos, I am the anon monk who posted the question. I could not sign in to my account previously because most of the time I am using the company network to post. If it's not too much to ask, I still have some questions.

      1. What happens if I have another directory, e.g. named TESTDIRECTORY, that contains log files in this format: u_exYYMMDD.log? How can I also grep all my relevant IPs from directories that do not have the date in the directory name? Here are the sample files in TESTDIRECTORY:

      u_ex171201.log
      u_ex171202.log
      u_ex171203.log
      u_ex171204.log
      Note that u_ex is a fixed prefix.

      2. As the other anon monk has asked, how do I match multiple IP addresses in this portion of the code? My sample input would be like 192.168.1.0,192.168.1.1,192.168.1.2

      sub searchForIP {
          my ($files, $ip) = @_;
          my @files = @$files;
          for my $file (@files){
              my $filename = $file;
              my $fh = IO::Uncompress::Bunzip2->new($filename)
                  or die "bunzip2 $filename: $IO::Uncompress::Bunzip2::Bunzip2Error";
              while (<$fh>){
                  print "$filename:$.:$_" if /$ip/;
              }
          }
      }
      Any other response will be greatly appreciated and thank you so much y'all.. :)
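For question 1, one possible approach is to generate the expected file names from the date range instead of relying on dated directory names. The sketch below assumes the start date is given as YYYY-MM-DD; `u_ex_names` is a hypothetical helper name, and Time::Piece is used here as an assumption (the original script uses Date::Manip).

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds 'ONE_DAY';

# Hypothetical helper: list the u_exYYMMDD.log names for a run of days.
sub u_ex_names {
    my ($start, $days) = @_;              # $start as YYYY-MM-DD
    my $t = Time::Piece->strptime($start, '%Y-%m-%d');
    my @names;
    for (1 .. $days) {
        # %y is the two-digit year, giving e.g. 171201
        push @names, 'u_ex' . $t->strftime('%y%m%d') . '.log';
        $t += ONE_DAY;
    }
    return @names;
}

print "$_\n" for u_ex_names('2017-12-01', 4);
```

This prints the four sample names listed above. The list could then be handed to File::Find::Rule, for example ->name(@names)->in($dir) with $dir pointing at TESTDIRECTORY, so only files for the wanted days are opened.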

      2018-01-09 Athanasius restored deleted content

        I have already resolved these questions. Thank you everyone for all your help! Cheers!
Re: Process multiple directories
by Anonymous Monk on Jan 08, 2018 at 08:51 UTC
    Update. One more problem with my modified script: it only works when I put 0 days; when I put 1 or more days, it doesn't print anything.
Re: Process multiple directories
by Anonymous Monk on Jan 08, 2018 at 09:33 UTC
    Update 2: Managed to make it work:
    my $LogDir = "/cygdrive/c/Users/anon/logs/png1/";
    my $LogDir1 = "/cygdrive/c/Users/anon/logs/png2/apache/";

    #Find relevant files
    sub get_files {
        my (@dirs) = @_;
        my @dirss = join "", map {$LogDir . $_} @dirs;
        my @dirss1 = join "", map {$LogDir1 . $_} @dirs;
        my $level = shift // 3; # level to dig into
        my @files = File::Find::Rule->file()
                                    ->name( '*.bz2','*.log' ) #can insert regex too
                                    ->maxdepth($level)
                                    ->in(@dirss,@dirss1);
        #print @files;
        return @files;
    }
    However, it does not work when I put >0 days. It only works when I put 0 days... Any help is greatly appreciated, thanks a million
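A likely reason for the "only works with 0 days" behaviour is the join "" in get_files: it concatenates every mapped path into one string, so with more than one date File::Find::Rule is handed a single path that does not exist. With 0 days there is presumably only one date, so the single string happens to be a valid path. A minimal sketch of the difference, using made-up dates and the directory names from the post:

```perl
use strict;
use warnings;

my @bases = ('/cygdrive/c/Users/anon/logs/png1/',
             '/cygdrive/c/Users/anon/logs/png2/apache/');
my @dates = ('2017-12-01', '2017-12-02');   # sample dates, for illustration

# As posted: join "" collapses the mapped list into ONE string
my @joined = join "", map { $bases[0] . $_ } @dates;

# Without the join: one path per base directory and date
my @paths = map { my $base = $_; map { $base . $_ } @dates } @bases;

print "joined (", scalar(@joined), "): $_\n" for @joined;
print "paths  (", scalar(@paths),  "): $_\n" for @paths;
```

Passing @paths to ->in(...) gives File::Find::Rule one real directory per date. Separately, `my $level = shift // 3;` runs after @_ has been copied but not emptied, so shift returns the first date string rather than falling back to 3; a fixed `my $level = 3;` may be closer to what was intended.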

      You might find it simpler to process a single directory at a time. For example

      #!/usr/bin/perl
      use strict;
      use File::Find::Rule;
      use IO::Uncompress::Bunzip2;
      use Time::Piece;
      use Time::Seconds 'ONE_DAY';
      #use Data::Dump 'pp';

      # build regex
      use Net::Netmask;
      my $ip = '72.46.130.0/24';
      my $block = new Net::Netmask($ip);
      my $re = join '|', map { quotemeta } $block->enumerate();
      #print $re;

      my $FMT = '%Y-%m-%d';
      my $dateStart = '2017-12-01';
      my $numberOfDays = 5;

      # array ref for dates
      my $dates = get_dates($dateStart,$numberOfDays);

      # Directories to search
      my @LogDir = qw(
        /cygdrive/c/Users/anon/Documents/logs/png1/
        /cygdrive/c/Users/anon/Documents/logs/png2/
      );

      for my $path (@LogDir){
        for my $ymd (@$dates){

          my $dir = $path.$ymd;
          next unless (-d $dir);
          print "In directory '$dir'\n";

          my @files = File::Find::Rule->file()
            ->name( '*.bz2','*.log' )
            ->maxdepth(3)
            ->in($dir);

          for my $filename (@files){
            searchForIP($filename,$re);
          }
        }
      }

      sub searchForIP {
        my ($filename,$re) = @_;
        print "--- Searching '$filename' ---\n";

        my $fh;
        if ($filename =~ /bz2$/){
          $fh = IO::Uncompress::Bunzip2->new($filename)
            or die "bunzip2 $filename: $IO::Uncompress::Bunzip2::Bunzip2Error";
        } else {
          open $fh,'<',$filename
            or die "Could not open $filename : $!";
        }

        my $count = 0;
        while (<$fh>){
          print "$filename:$.:$_" if /$re/;
          ++$count;
        }
        print "\n$count lines scanned\n";
      }

      sub get_dates {
        my ($start,$days) = @_;
        my @dates;
        my $t = Time::Piece->strptime($start,$FMT);
        for (1..$days){
          push @dates,$t->ymd;
          $t += ONE_DAY;
        }
        return \@dates;
      }
      poj
      What's with the @dirss/1 junk?