Greetings,
Solution 2 does in fact refer to the idea of keeping a running count; however, the idea of "running through" the data sort of goes away.
I am proposing keeping a two-column table (list, tabulation...):
Time     | Count
---------+------
00:15:48 |   2
03:20:12 |   1
03:30:00 |   0
07:25:46 |   4
...      |  ...
This is a succinct (I venture to speculate it may in fact be nearly optimal) representation of the stair-step time function that you are interested in. Of course you have to scan your data set in order to compute it - but that is the minimal requirement for the task at hand, short of oracular intervention (thinking Delphi - the place - not Mountain View). You then keep your table updated by processing only the events that took place since the latest variation you recorded. On Red Hat Linux this could easily be accomplished by hooking into the daily logrotate job.
If this seems inelegant, the process doing the event logging could perhaps be hacked to also increment/decrement the count as the events arrive. (Of course, in a situation where you may not have reliable termination events - like weblogs - everything becomes more complex...)
The use of a DBMS does in fact mean that you set up a data structure for your data (your #1 idea), but the critical point is that it is a data structure highly optimized for searching, sorting and counting - and that is not a trivial addition. Besides, the kind of tables that others have proposed have the virtue of making many more types of questions answerable by simple SQL means. OTOH, the table outlined above is minimally tuned to answer the "how many" questions without the information loss entailed by any periodic sampling.
And I should mention that, if I had to do it, I would probably try to set it up in MRTG, which is probably already 75% there. Of course, MRTG does use periodic sampling and reaggregates its data set on the basis of a larger time step for the weekly and monthly stats, which is a lossy process.
Cheers,
alf
You can't have everything: where would you put it?
Ok. I see what you're getting at now. The table was very helpful. This is a really unique solution among the ones I've seen so far. Once I get this module finished, I think I may go back and run benchmarks between my chosen solution (the one involving granularity-based marks and, from those, incremental on-off logs) and this one. If I do, I'll letcha know how they come out. If these logs you've thought up were separated somehow (by day into files, or something), the seek time wouldn't be too bad at all.
My officemate suggested last night that a tree-based system might be devised that could do any search in better than O(N) time. We didn't get it quite worked out, but I'm gonna keep at it and see what I can come up with.
--
Love justice; desire mercy.