in reply to How to do session times The Right Way

How about something like this? You want to avoid having to parse the whole log to find an instant in time, but you don't want to save all the instants in time that you could possibly need.

A compromise would be to have a process that follows the log (or parses it from the start), saving checkpoints once in a while. So say, once a day, the process says that "Right now, the following 12 people are online, and they logged on at the corresponding times: ...report...".

Then, when you want to find a specific interval, you only need to parse from the preceding checkpoint to the end of the interval. Depending on how frequently you checkpoint, this could be very simple.
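
To make the idea concrete, here is a minimal sketch of the checkpointing pass. It is an illustration only, not a finished implementation: the `(timestamp, user, action)` event tuples, the function name `build_checkpoints`, and the sample users are all assumptions made up for the example.

```python
from datetime import datetime, timedelta

def build_checkpoints(events, first_checkpoint, interval):
    """Walk the log in time order and snapshot who is online every `interval`.

    events: list of (timestamp, user, action), where action is "on" or "off"
            (a hypothetical format, not taken from any real log).
    Returns a list of (checkpoint_time, resume_index, {user: logon_time}).
    resume_index is the position of the first event at or after the checkpoint.
    """
    checkpoints = []
    online = {}
    next_cp = first_checkpoint
    for i, (ts, user, action) in enumerate(events):
        # Emit every checkpoint that falls at or before this event.
        # (A process following a live log would emit these on a timer instead.)
        while ts >= next_cp:
            checkpoints.append((next_cp, i, dict(online)))
            next_cp += interval
        if action == "on":
            online[user] = ts
        else:
            online.pop(user, None)
    return checkpoints

# Hypothetical sample data, hourly checkpoints starting at 10:00.
day = datetime(2002, 10, 31)
EVENTS = [
    (day.replace(hour=9, minute=15), "alice", "on"),
    (day.replace(hour=9, minute=40), "bob", "on"),
    (day.replace(hour=10, minute=5), "alice", "off"),
    (day.replace(hour=11, minute=30), "carol", "on"),
]
CHECKPOINTS = build_checkpoints(EVENTS, day.replace(hour=10), timedelta(hours=1))
```

Each checkpoint carries the index to resume reading from, so a later query never has to touch events before it.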

-- Dan

Re: (z) Re: How to do session times The Right Way
by strider corinth (Friar) on Oct 31, 2002 at 18:54 UTC
    For intervals, I'm impressed: this is a really inventive solution. I definitely think it would work well for that. The only problems I see are if I'm looking for a point in time, or if a user is logged in for a shorter period of time than the interval between checkpoints. A point-in-time query won't be accurate, because logoff times are only known to the hour (we can't tell whether the user had logged off by, say, 1:32:56). A short session won't show up, because no data will be recorded for it at all.

    I think a slight modification of this system would work well for all situations. At each interval (each hour, let's say) let the system log all the users online and the time they logged on. The same record is updated with the stop times of all users who log off during that hour. That way, to find out how many users were on at 1:32:56, the script only needs to look at the entry for 1:00. Any users logged on at 1:00 who hadn't yet logged off at 1:32:56 were online at that time.

    The only problem left in that scenario is of users who log on after 1:00 and off before 2:00, leaving only a logoff record in the 2:00 entry. If each entry logs any logons for that hour as well, the problem is solved with just a little more computing, and still looking only at one record.
    --

    Love justice; desire mercy.

      I think either I'm not understanding where you see the problem, or you're not seeing my solution :)

      The only problems I see are if I'm looking for a point in time, or if a user is logged in for a shorter period of time than the interval of the checkpoints.

      Why is there a problem? Say you want to know who was logged on at 13:05 (with hourly checkpoints). So your log will look something like this:

      ...
      12:58:03 User 1 sign on
      12:59:03 User 1 sign off
      12:59:25 User 2 sign on
      12:59:50 User 3 sign on
      13:00:00 REPORT: online: (2-12:59:25,3-12:59:50)
      13:02:23 User 4 sign on
      13:03:12 User 3 sign off
      13:04:23 User 4 sign off
      13:04:34 User 5 sign on
      13:06:52 User 5 sign off
      ...

      (The REPORT line is probably external to the log file, and would probably also keep the offset in the file from which you should start reading.)

      First, the program needs to look at the checkpoint preceding the beginning of the interval (or the point in time, same thing) - so it knows that at 13:00, there were 2 users online. Then it starts reading the log file from after the checkpoint, keeping track of logons and logoffs, until it passes the end point of the interval (or the point in time). So when it reaches 13:06:52, it stops, and just dumps out its data:

      Users online: 2
      User 2 (since 12:59:25)
      User 5 (since 13:04:34)

      I think my point is, the checkpoints only keep record of historical data up to that point, so you don't have to parse the log file from the beginning of time. It just "primes" your data structure. But from the checkpoint on, you just parse the log file as usual, modifying your data as you go along.
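
      As a rough sketch of that replay step, the following uses the sample log above with the REPORT line kept outside the log as a separate checkpoint. The `CHECKPOINT` layout, the line-index `offset`, and the function name `online_at` are assumptions for illustration; times carry no date here only because the sample log has none.

```python
from datetime import time

# The sample log from this post, one event per line, without the REPORT line.
LOG = """\
12:58:03 User 1 sign on
12:59:03 User 1 sign off
12:59:25 User 2 sign on
12:59:50 User 3 sign on
13:02:23 User 4 sign on
13:03:12 User 3 sign off
13:04:23 User 4 sign off
13:04:34 User 5 sign on
13:06:52 User 5 sign off
""".splitlines()

# Hypothetical checkpoint stored outside the log: its time, the line index
# to resume reading from, and who was online with their logon times.
CHECKPOINT = {
    "time": time(13, 0, 0),
    "offset": 4,
    "online": {"User 2": time(12, 59, 25), "User 3": time(12, 59, 50)},
}

def online_at(point, checkpoint, log_lines):
    """Return {user: logon_time} for everyone online at `point`."""
    if checkpoint["time"] > point:
        raise ValueError("checkpoint must not be later than the query point")
    online = dict(checkpoint["online"])      # prime the state from the checkpoint
    for line in log_lines[checkpoint["offset"]:]:
        stamp, rest = line.split(" ", 1)
        ts = time.fromisoformat(stamp)
        if ts > point:                       # passed the point of interest: stop
            break
        user, action = rest.rsplit(" sign ", 1)
        if action == "on":
            online[user] = ts
        else:
            online.pop(user, None)
    return online

RESULT = online_at(time(13, 5, 0), CHECKPOINT, LOG)
```

      For 13:05 this reads only the lines after the checkpoint and stops at the 13:06:52 entry, leaving User 2 and User 5 in `RESULT`.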

      -- Dan

        A-ha. I see. I thought you were describing a whole new structure; I didn't realize that the old log data was still to be used.

        Ok, then. That's a great solution. =) Thanks.

        The system I derived from the solution you didn't (as it turns out =) provide looks like this at the end of the hour, for users a, b, and c:
        2:00
          - Previous: a: 1:00  b: 1:35  c: 1:50
          - Logons:   d: 2:10
          - Logouts:  a: 2:30  d: 2:50
        ...
        This way, none of the original data needs to be kept. To find out who was logged in at 2:15, you check the Previous and Logons fields against the Logouts field, and anything falling in between matches. It was something like that (minus a couple of nuances) that I thought you meant.
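
        A small sketch of that lookup, using only the entries shown in the 2:00 record above. The dictionary layout and the name `online_at_from_record` are made up for the example, and it assumes each user has at most one session per record and that a logout exactly at the query time counts as offline; repeated sessions within the hour are one of the nuances it skips.

```python
from datetime import time

# The 2:00 record from above, in a hypothetical dictionary layout.
RECORD = {
    "previous": {"a": time(1, 0), "b": time(1, 35), "c": time(1, 50)},
    "logons": {"d": time(2, 10)},
    "logouts": {"a": time(2, 30), "d": time(2, 50)},
}

def online_at_from_record(point, record):
    """Return the sorted users who were online at `point` within this record."""
    # Everyone who could be online: carried over from earlier, or logged on this hour.
    started = {**record["previous"], **record["logons"]}
    logouts = record["logouts"]
    return sorted(
        user
        for user, logon in started.items()
        if logon <= point and not (user in logouts and logouts[user] <= point)
    )

AT_2_15 = online_at_from_record(time(2, 15), RECORD)
```

        At 2:15 that yields a, b, c and d: a has not yet logged out, and d logged on at 2:10.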
        --

        Love justice; desire mercy.