Re^2: How to quickly parse a huge web log file?

Hi superdoc,

I see a few people have replied to my question :). I guess I should have checked back here earlier, as I spent my entire weekend figuring out on my own what you guys suggest here.

The logfiles that I am dealing with are already sorted by date. I still have to read line by line, but not necessarily every line. What I have ended up doing is this:

a) I get the first and last date of the logfile.
b) I check to see if the date that I seek is closer to the beginning or end of the file.
c) I start searching from the beginning or the end, based on which end of the file the date is closer to.
d) Once the date I am seeking starts to appear in the file, I watch for the next date. Once the next date is encountered, I stop and skip the rest of the file. This usually cuts processing time by 50% or more.
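
Roughly, steps (a) to (d) look like the sketch below. It is in Python rather than Perl purely for illustration, and it is not my actual script. It assumes every line carries a Common Log Format timestamp such as [23/Jul/2007:13:58:00 +0000] and that the file is sorted by date; the names line_date, lines_backward and lines_for_date are made up for this example.

```python
import os
import re
from datetime import date

# Assumption: Common Log Format timestamps, e.g. [23/Jul/2007:13:58:00 +0000]
MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), 1)}
STAMP = re.compile(rb"\[(\d{2})/([A-Za-z]{3})/(\d{4})")


def line_date(line):
    """Return the date found in a raw log line, or None if there is none."""
    m = STAMP.search(line)
    if m is None:
        return None
    day, mon, year = m.groups()
    return date(int(year), MONTHS[mon.decode()], int(day))


def lines_backward(f, chunk=64 * 1024):
    """Yield the non-empty lines of binary file f from last to first."""
    f.seek(0, os.SEEK_END)
    pos = f.tell()
    tail = b""
    while pos > 0:
        step = min(chunk, pos)
        pos -= step
        f.seek(pos)
        parts = (f.read(step) + tail).split(b"\n")
        tail = parts[0]  # possibly a partial line; finish it on the next pass
        for line in reversed(parts[1:]):
            if line:
                yield line
    if tail:
        yield tail


def lines_for_date(path, wanted):
    """Return the lines for `wanted`, scanning from the nearer end of the file."""
    with open(path, "rb") as f:
        # (a) first and last date in the file
        first = line_date(f.readline())
        last = next((line_date(l) for l in lines_backward(f)), None)
        if first is None or last is None or not (first <= wanted <= last):
            return []
        # (b) which end is the wanted date closer to?
        from_start = (wanted - first) <= (last - wanted)
        # (c) scan from that end
        if from_start:
            f.seek(0)
            source = f
        else:
            source = lines_backward(f)
        hits = []
        for line in source:
            line = line.rstrip(b"\r\n")
            d = line_date(line)
            if d is None:
                continue
            if d == wanted:
                hits.append(line)
            elif (d > wanted) if from_start else (d < wanted):
                break  # (d) we are past the wanted date, skip the rest
        if not from_start:
            hits.reverse()  # restore file order
        return [h.decode(errors="replace") for h in hits]
```

Calling lines_for_date("access.log", date(2007, 7, 22)) would return only that day's lines. Note that "closer" here is judged by date, not by byte offset, so it only guesses at which end of the file is nearer.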

To answer others' questions about using grep: I have been using grep, awk and sed to do these tasks for years, and they don't appear to be any faster than Perl regexes.

What is this binary search that you speak of? This might help me out a lot.

Thanks

Re^3: How to quickly parse a huge web log file? by Corion (Patriarch) on Jul 23, 2007 at 13:58 UTC
Binary Search