in reply to Using binary search to get the last 15 minutes of httpd access log
I'm a fan of the Binary Search (my implementation also borrows heavily from Mastering Algorithms with Perl), but I'm not sure it's the best tool for this job. You're looking for everything from 15 minutes back, through the present (end of the file). File::ReadBackwards seems like a better algorithm. On the one hand it's a linear search from the end of file to the 15 minute mark. On the other hand, even if you're using a binary search to find that 15 minute mark, you still have the linear operation of retrieving everything from the EOF to that 15 minute mark for processing.
May as well just read the file backwards, and stop when you find the record that is more than 15 minutes old. That way instead of a O(log n) plus an O(n) operation (the search, and the retrieve), you have a single O(n) operation (the retrieve, stopping when you get to the mark). But more importantly, you eliminate a ton of complexity.
Quoting from Wikipedia:
When Jon Bentley assigned it as a problem in a course for professional programmers, he found that an astounding ninety percent failed to code a binary search correctly after several hours of working on it, and another study shows that accurate code for it is only found in five out of twenty textbooks. Furthermore, Bentley's own implementation of binary search, published in his 1986 book Programming Pearls, contains an error that remained undetected for over twenty years.
Dave
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^2: Using binary search to get the last 15 minutes of httpd access log
by mcdave (Beadle) on Aug 03, 2012 at 21:19 UTC | |
by davido (Cardinal) on Aug 03, 2012 at 23:21 UTC | |
by mhearse (Chaplain) on Aug 03, 2012 at 21:57 UTC | |
by mhearse (Chaplain) on Oct 09, 2012 at 23:14 UTC | |
|
Re^2: Using binary search to get the last 15 minutes of httpd access log
by mhearse (Chaplain) on Aug 03, 2012 at 22:01 UTC | |
by clueless newbie (Curate) on Aug 03, 2012 at 22:37 UTC |