This was one of the first scripts I wrote. I needed to rapidly find strings in compressed IIS logfiles, and this was the result.

It handles both compressed and uncompressed logfiles in the same directory, but does not handle ZIP files with more than one file inside.

Any feedback welcomed.

#
# Search.pl -- Search for arbitrary string in logfiles
# Designed for an ActivePerl/Win32 environment
#
# Author: Joshua Thomas
# Last update: 10/24/2002
#
# 1.0: initial release
#

use Archive::Zip qw( :ERROR_CODES :CONSTANTS );
use Cwd;

# Get our path and current working dir
# Expect the following args, ARGV[0] = [zip|log|all]
# ARGV[1] = [path], ARGV[2] = [phrase], ARGV[3] = [outfile]

($scope = $ARGV[0]) || &usage;
($path = $ARGV[1]) || &usage;
($phrase = $ARGV[2]) || &usage;
($outfile = $ARGV[3]) || &usage;

# Strip all whitespace from the scope argument
$scope =~ s/\s+//g;

# Scope must be exactly one of the three keywords
if ($scope !~ /^(zip|log|all)$/) { &usage; }

# Move to working directory

$cwd = getcwd();
chdir $path or die "could not chdir to $path: $!";

# Open the master file that we write results out to

open(OUTFILE, ">$outfile") or die "could not open $outfile: $!";

# Loop through all the zip files

if ($scope =~ /^(zip|all)$/) {

    while (defined ($file = glob("*.zip"))) {

        print "$file: ";

        $zip = Archive::Zip->new();
        die 'Bad zip file!' if $zip->read( $file ) != AZ_OK;

        # We only expect one member/file [for now]
        @members = $zip->memberNames();
        $extracted = $members[0] . ".tmp";
        die "could not extract $members[0]!" if $zip->extractMember($members[0], $extracted) != AZ_OK;

        print "Extracted $members[0], ";

        # Now we've got <member name>.tmp

        # Find string, write to file ($phrase is treated as a regex)
        open(INFILE, $extracted) or die "could not open $extracted: $!";
        print "finding matches, ";

        while (<INFILE>) {
            if (/$phrase/) {
                print OUTFILE $_;
            }
        }

        close(INFILE);

        # Remove the temp file without shelling out to 'del'
        unlink $extracted;

        print " done.\n\n";
    }
}

# Loop through .log files

if ($scope =~ /^(log|all)$/) {

    while (defined ($file = glob("*.log"))) {

        print "$file: ";

        # Don't have to extract the file here, skip right to searching

        open(INFILE, $file) or die "could not open $file: $!";
        print "finding matches, ";

        while (<INFILE>) {
            if (/$phrase/) {
                print OUTFILE $_;
            }
        }

        close(INFILE);

        print " done.\n\n";
    }
}

# Change back to starting directory
chdir $cwd;

# Close file

close(OUTFILE);

sub usage {
    print ("search.pl -- Find lines with a given phrase from a directory of logfiles.\n");
    print ("usage: search.pl [zip|log|all] [path] [phrase] [outfile]\n\n");
    print ("option 'zip' will search compressed .zip archives\n");
    print ("option 'log' will search uncompressed .log files\n");
    print ("option 'all' will do both .zip and .log files\n");
    print ("[path] should be a full-length path surrounded by double-quotes.\n");
    exit(0);

}

/rgds,
ibanix
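
The single-member limitation mentioned at the top could probably be lifted by looping over every member of the archive instead of taking only the first name. What follows is a minimal sketch, not part of the original script: the sub name `grep_zip_members` is hypothetical, and it reads each member fully into memory with `contents()` rather than extracting to a temp file, which may be a poor fit for very large logs.

```perl
use strict;
use warnings;
use Archive::Zip qw( :ERROR_CODES );

# Hypothetical helper: return every line matching $phrase from every
# file member of the archive $zipfile (directories are skipped).
sub grep_zip_members {
    my ($zipfile, $phrase) = @_;

    my $zip = Archive::Zip->new();
    $zip->read($zipfile) == AZ_OK or die "Bad zip file: $zipfile\n";

    my @hits;
    for my $member ($zip->members()) {
        next if $member->isDirectory();

        # contents() decompresses the whole member into memory
        my $text = $member->contents();

        # split /^/ breaks the text into lines, keeping the newlines
        push @hits, grep { /$phrase/ } split /^/, $text;
    }
    return @hits;
}
```

Something like `print OUTFILE grep_zip_members($file, $phrase);` inside the zip loop would then replace the extract, search, and delete steps.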

Replies are listed 'Best First'.
Re: compressed logfile grep
by submersible_toaster (Chaplain) on Nov 22, 2002 at 23:04 UTC
    I have some suggestions, for readability and simplicity.
    With utility scripts like this, I find that putting the usage at the top of the code reminds me what the script is SUPPOSED to do. &usage is cool, but you can do something similar and, IMHO, easier:

    my $usage=<<EOF;
    usage: These are your options punk.
    EOF
    die $usage unless ($#ARGV==3);

    Also, I'm pretty sure you could use map to replace the core searcher:

    print OUTFILE map { /$phrase/ && $_} <INFILE>;

    That might reduce the number of operations you need for each match, compared with a while/if arrangement.
      An even simpler way here would be

      print OUTFILE grep /$phrase/, <INFILE>;

      Unfortunately, replacing while with a map or grep results in slurping the file to search it, which is most likely undesired with logfiles. It can be whittled down though:

      /$phrase/ && print OUTFILE $_ while <INFILE>;
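
The streaming form can be sketched as a small self-contained sub. This is only an illustration: the name `grep_stream` and the in-memory filehandles used for the demo are assumptions of this sketch, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines matching $phrase from $in to $out,
# reading one line at a time so the whole log is never held in memory.
sub grep_stream {
    my ($in, $out, $phrase) = @_;
    my $re = qr/$phrase/;    # compile the pattern once
    my $count = 0;
    while (my $line = <$in>) {
        next unless $line =~ $re;
        print {$out} $line;
        $count++;
    }
    return $count;
}

# Demo on in-memory filehandles so the sketch runs without real logfiles.
my $log = "GET /index.html 200\nGET /missing 404\nGET /gone 404\n";
open(my $in, '<', \$log) or die "cannot open in-memory input: $!";
my $matches = '';
open(my $out, '>', \$matches) or die "cannot open in-memory output: $!";

my $n = grep_stream($in, $out, ' 404$');
close($out);
print "matched $n line(s)\n";
```

Because the loop reads with `<$in>` in scalar context, memory use stays at roughly one line regardless of logfile size, which is the point being made against the map and grep versions.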

      Makeshifts last the longest.

Re: compressed logfile grep
by joe++ (Friar) on Nov 21, 2002 at 21:31 UTC
    What about:
    #!/bin/sh
    for F in `ls *zip`
    do
        zcat $F | grep "what I'm interested in"
    done
    I mean, sometimes Perl isn't the right tool...
    (Oh yeah, I assume Un*x here).

    --
    Cheers, Joe

      Sadly, no Unix.

      If you read the code, you'd see I intended it for an ActivePerl/Win32 environment. :-)

      <-> In general, we find that those who disparage a given operating system, language, or philosophy have never had to use it in practice. <->