Re: Binary Search Timestamps
by CountZero (Bishop) on Jul 16, 2014 at 10:05 UTC
As you want to generate various statistics on your log files, you will have to read these log files again and again and again. It would be much better to put the log files in a database and run standard SQL queries on them. You only have to enter each line of the log file into the database once, nicely split into its various fields, and that won't be terribly difficult to automate. Then you can update all these statistics just by running your SQL queries, which will be much faster than re-reading the log file again and again.
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
My blog: Imperial Deltronics
In an ideal world, this is the method I'd prefer to use as well, but unfortunately we don't have the resources to do this at the moment.
Thanks for your input, though; it's much appreciated.
This is false economy; depending on your skill level… Putting it all into SQLite is semi-trivial (Pg or MySQL only slightly harder) and would obviate the need for repetitive, selective, time-consuming, and error-prone reparsing.
Sometimes taking a time hit up front for extra code, structure, or learning will save you tenfold down the road.
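To give a rough idea of how little code the load-once, query-many approach needs, here is a minimal sketch in Python (not Perl) using its bundled `sqlite3` module. The log format (`YYYY-MM-DD HH:MM:SS LEVEL message`), the table layout, and the function names are all assumptions made for illustration; the real log will need its own parsing.

```python
import sqlite3

# Hypothetical log format (an assumption): "YYYY-MM-DD HH:MM:SS LEVEL message"
SAMPLE_LINES = [
    "2014-07-16 09:01:12 INFO service started",
    "2014-07-16 09:15:40 ERROR disk full",
    "2014-07-16 09:47:03 INFO request served",
    "2014-07-16 10:02:55 ERROR timeout",
]

def load_lines(conn, lines):
    """Split each log line into fields and insert it once."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log (ts TEXT, level TEXT, message TEXT)"
    )
    rows = []
    for line in lines:
        date, time, level, message = line.rstrip("\n").split(" ", 3)
        rows.append((f"{date} {time}", level, message))
    conn.executemany("INSERT INTO log VALUES (?, ?, ?)", rows)
    conn.commit()

def counts_since(conn, since):
    """Count entries per level with a timestamp at or after `since`."""
    cur = conn.execute(
        "SELECT level, COUNT(*) FROM log WHERE ts >= ? "
        "GROUP BY level ORDER BY level",
        (since,),
    )
    return dict(cur.fetchall())

conn = sqlite3.connect(":memory:")  # use a file path to keep the data
load_lines(conn, SAMPLE_LINES)
stats = counts_since(conn, "2014-07-16 09:05:00")
print(stats)
```

Timestamps in this ISO-like form sort correctly as plain text, which is why a simple `ts >= ?` comparison is enough here; an index on `ts` would keep the query fast as the table grows.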
Re: Binary Search Timestamps
by Athanasius (Archbishop) on Jul 16, 2014 at 09:54 UTC
Hello Paraxial, and welcome to the Monastery!
Since you will need the data from all the logs up to an hour old, you don’t really need a binary search. Just read the file backwards until the latest record read is more than an hour old. I haven’t used it, but the module File::ReadBackwards is designed for just this task:
This module reads a file backwards line by line. It is simple to use, memory efficient and fast. ...
It is intended for processing log and other similar text files which typically have their newest entries appended to them.
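For illustration only, the idea of reading backwards until a record is more than an hour old can be sketched as follows. This is Python rather than Perl, and it is not the File::ReadBackwards API; the names `reverse_lines` and `recent_lines`, and the assumption that each line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, are hypothetical.

```python
import io
import os
from datetime import datetime, timedelta

def reverse_lines(fh, chunk_size=4096):
    """Yield non-empty lines of a binary file object, last line first."""
    fh.seek(0, os.SEEK_END)
    pos = fh.tell()
    tail = b""  # start of a line whose beginning lies in an earlier block
    while pos > 0:
        step = min(chunk_size, pos)
        pos -= step
        fh.seek(pos)
        pieces = (fh.read(step) + tail).split(b"\n")
        tail = pieces[0]  # may be incomplete, so hold it back
        for raw in reversed(pieces[1:]):
            if raw:
                yield raw.decode("utf-8")
    if tail:
        yield tail.decode("utf-8")

def recent_lines(fh, now, window=timedelta(hours=1)):
    """Collect lines no older than `window`, stopping at the first older one."""
    cutoff = now - window
    kept = []
    for line in reverse_lines(fh):
        # Assumed format: the line starts with "YYYY-MM-DD HH:MM:SS"
        stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        if stamp < cutoff:
            break
        kept.append(line)
    kept.reverse()  # restore chronological order
    return kept

# An in-memory stand-in for a log file opened with open(path, "rb")
log = io.BytesIO(
    b"2014-07-16 09:01:12 INFO service started\n"
    b"2014-07-16 09:15:40 ERROR disk full\n"
    b"2014-07-16 09:47:03 INFO request served\n"
    b"2014-07-16 10:02:55 ERROR timeout\n"
)
recent = recent_lines(log, now=datetime(2014, 7, 16, 10, 5, 0))
print(recent)
```

Only the tail of the file is ever read, so the cost depends on how much was logged in the last hour rather than on the total size of the file.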
Hope that helps,
I did in fact look at this method yesterday, and it does seem to be a good alternative to binary searching, especially when working with things like log files.
That said, the File::SortedSeek module seems to be a better fit for what I'm doing, though thanks for your input. As a complete newbie to Perl, it's nice to see I didn't go too far off the mark when looking for a solution to this.
Re: Binary Search Timestamps
by AppleFritter (Vicar) on Jul 16, 2014 at 09:46 UTC
Howdy Paraxial, welcome to the Monastery!
Binary Searches on Sorted Text Files has a useful snippet of code for binary-searching text files. In its current state it's intended for sorted text files, but all you'd really need to do to adapt it to your needs is modify the conditions for the recursive calls. In fact, I'd pass a callback function there as an extra parameter instead of hardcoding anything specific.
One of the comments on that node also points out File::SortedSeek, which looks like it may well be useful.
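As a rough illustration of the callback idea, here is a sketch of a byte-offset binary search over a sorted file. It is written in Python, is iterative rather than recursive, and is neither the linked snippet nor File::SortedSeek. The function name `seek_first_not_before` and the timestamp format are assumptions; the callback decides whether a line sorts before the wanted position, so nothing about the log format is hardcoded in the search itself.

```python
import io
import os

def seek_first_not_before(fh, is_before):
    """Position fh at the first line for which is_before(line) is false.

    fh must be a binary file object whose lines are sorted so that
    is_before is true for a leading run of lines and false afterwards.
    Returns the byte offset found (the file size if no line qualifies).
    """
    def line_start_at_or_after(pos):
        # Step back one byte so a pos that is already a line start is kept.
        if pos == 0:
            fh.seek(0)
            return 0
        fh.seek(pos - 1)
        fh.readline()
        return fh.tell()

    fh.seek(0, os.SEEK_END)
    lo, hi = 0, fh.tell()
    while lo < hi:
        mid = (lo + hi) // 2
        fh.seek(line_start_at_or_after(mid))
        line = fh.readline()
        if line and is_before(line):
            lo = mid + 1  # the wanted line starts after this one
        else:
            hi = mid      # this line (or EOF) is at or past the target
    start = line_start_at_or_after(lo)
    fh.seek(start)
    return start

# An in-memory stand-in for a log file opened with open(path, "rb")
log = io.BytesIO(
    b"2014-07-16 09:01:12 INFO service started\n"
    b"2014-07-16 09:15:40 ERROR disk full\n"
    b"2014-07-16 09:47:03 INFO request served\n"
    b"2014-07-16 10:02:55 ERROR timeout\n"
)
cutoff = b"2014-07-16 09:05:00"
# Assumed format: fixed-width timestamps that compare correctly as bytes
offset = seek_first_not_before(log, lambda line: line[:19] < cutoff)
found = log.read().decode("utf-8").splitlines()
print(offset, found)
```

Each probe reads at most two lines, so locating the start of the last hour takes a number of seeks that grows with the logarithm of the file size.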
Thanks for this. I found the page on binary searching sorted text files before I posted, but failed to see the comment mentioning File::SortedSeek!
You're right, it does exactly what I need, and I've now managed to get it working with the log file as expected, so thank you!
I will probably run this script every 5-10 minutes so that each run has less data to process, as it gets quite hungry for RAM.
Re: Binary Search Timestamps
by locked_user sundialsvc4 (Abbot) on Jul 16, 2014 at 10:58 UTC
Arrange to have the file rotated, then process the files that have rotated off, putting the results into a database. Also, bear in mind that there are many existing programs out there which already do this job quite thoroughly, fun though it may seem to write yet another one. There must be a hundred already.
Re: Binary Search Timestamps
by Anonymous Monk on Jul 16, 2014 at 20:16 UTC
Another option is to have a (daemon) process reading the log file (think tail -f); this could wake up, say, every 5 minutes, consume new lines, update counter bins, and write out the statistics page or meta log-file.
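A minimal sketch of that polling idea, in Python rather than Perl: remember the byte offset reached so far, and on each wake-up read only what was appended since. The function name `consume_new_lines`, the log format (`YYYY-MM-DD HH:MM:SS LEVEL message`), and binning by level are assumptions for illustration; a real daemon would wrap the call in a loop with a sleep of about 300 seconds between polls.

```python
import os
import tempfile
from collections import Counter

def consume_new_lines(path, offset, bins):
    """Read complete lines appended since `offset`; return the new offset."""
    if os.path.getsize(path) < offset:
        offset = 0  # file shrank: assume it was truncated or rotated
    with open(path, "rb") as fh:
        fh.seek(offset)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # EOF or a half-written line; retry on the next poll
            offset = fh.tell()
            # Assumed format: "YYYY-MM-DD HH:MM:SS LEVEL message"
            fields = line.decode("utf-8").split(" ", 3)
            if len(fields) == 4:
                bins[fields[2]] += 1
    return offset

# Demonstration with two polls in place of the sleep loop
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "app.log")
bins = Counter()

with open(path, "wb") as fh:
    fh.write(b"2014-07-16 09:01:12 INFO service started\n")
    fh.write(b"2014-07-16 09:15:40 ERROR disk full\n")
offset = consume_new_lines(path, 0, bins)
first_poll = dict(bins)

with open(path, "ab") as fh:
    fh.write(b"2014-07-16 10:02:55 ERROR timeout\n")
    fh.write(b"2014-07-16 10:03:01 INFO half-writ")  # not yet complete
offset = consume_new_lines(path, offset, bins)
second_poll = dict(bins)
print(first_poll, second_poll)

os.remove(path)
os.rmdir(workdir)
```

Because only the counters and one offset are kept between polls, memory use stays flat no matter how large the log grows.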