in reply to Multi-format Log Parser

Very nice.

This script should make my life easier when working with log files from the shell. Most importantly, it makes piping through commands like 'sort', 'uniq', 'cut' and 'grep' much easier.

One nice addition would be the _option_ to build the regexp by analysing the httpd.conf file given the format name.
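For the common escapes the translation table is small. A rough, untested sketch of the idea (it handles only the basic % escapes, and ignores things like %{Referer}i and custom quoting):

    use strict;
    use warnings;

    # Regex fragments for the basic LogFormat escapes
    my %fragment = (
        '%h'  => '(\S+)',          # remote host
        '%l'  => '(\S+)',          # identd user
        '%u'  => '(\S+)',          # auth user
        '%t'  => '\[([^\]]+)\]',   # [timestamp]
        '%r'  => '([^"]*)',        # request line (the format supplies the quotes)
        '%>s' => '(\d{3})',        # final status code
        '%b'  => '(\d+|-)',        # response size, or '-'
    );

    sub format_to_regex {
        my ($fmt) = @_;
        my $re = quotemeta $fmt;   # escape the literal parts of the format
        # quotemeta turned %>s into \%\>s etc., so substitute the escaped forms
        $re =~ s/\\%\\>s/$fragment{'%>s'}/ge;
        $re =~ s/\\%([hlutrb])/$fragment{"%$1"}/ge;
        return qr/^$re$/;
    }

    # Common Log Format, as it appears in a LogFormat directive
    my $re = format_to_regex('%h %l %u %t "%r" %>s %b');

    while (my $line = <>) {
        chomp $line;
        my @fields = $line =~ $re or next;
        print join("\t", @fields), "\n";
    }

Looking up the format string for a given nickname would then just be a matter of scanning httpd.conf for the matching LogFormat directive.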

Another one would be to eliminate the need to uncompress the log file outside the script with zcat or zgrep.

Update:
You could use Compress::Zlib for this.
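A minimal, untested sketch of how that could look (zlib's gzopen reads plain files transparently, so one code path could serve both compressed and uncompressed logs):

    use strict;
    use warnings;
    use Compress::Zlib;

    my $file = shift @ARGV or die "usage: $0 logfile[.gz]\n";

    # gzopen reads gzipped and plain files alike
    my $gz = gzopen($file, "rb")
        or die "Cannot open $file: $gzerrno\n";

    while ($gz->gzreadline(my $line) > 0) {
        print $line;    # feed each decompressed line to the parser instead
    }
    $gz->gzclose();

Falling back to STDIN when no filename is given would keep the script usable as a filter.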

Re: Re: Multi-format Log Parser
by cjensen (Sexton) on Oct 05, 2001 at 23:40 UTC
    I typically do something like this:

    zgrep -h '2001:13:5' *access_log* | logparse -p="%o" | sort | uniq -c | sort -r | head -30

    That gets the top 30 hosts that referred traffic during the ten-minute block between 1:50 and 2:00 PM. If I know there was an issue, like a spike or a dip, I use this filter to investigate what may have caused it. It works well with tail -f too.

    I thought (briefly) about auto-building the regex by analyzing a conf file and/or the actual log, but it was quicker to write and hard-code them myself, and I don't change log file formats very often. You can do some pretty funky things with log file formats, and it doesn't seem like it would be easy to anticipate all those possibilities.

    Would it still be useful as a filter if you use Compress::Zlib? I don't want to pass more data into the script than necessary. Still thinking about the idea.
      Would it still be useful as a filter if you use Compress::Zlib?

      It would only be useful in the cases where you zcat into your filter first and then grep on the resulting format. For example, grepping on a URL could bring up referers and requests alike, so if you strip the referers out first, you can then safely grep for requests. The same goes for other fields, like status codes vs. sizes vs. IPs.