How to Iterate to Identify Concurrent Users

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: How to Iterate to Identify Concurrent Users (trees, merge) by tye (Sage) on Oct 26, 2007 at 01:36 UTC
This is exactly the type of problem that I've wanted to solve with Perl that lead me to propose that Perl add tree-based, very efficient, sorted hashes in a future version (such that entries in the hash are sorted by key). Since Perl doesn't have that, you could certainly find a module that implements hashes with sorted keys and use that (if the max number of concurrent users is not huge, then the performance should be acceptable). Then you read the "start" records in chronological order and insert them into a sorted hash with the "end" date as the key. Then you basically do a merge on the "start" records that you are reading and on the ever-changing hash of still-current users. Each time you insert a new "start" record into the hash, you first remove all of the items having an "end" date prior to the new "start" date. Track the size of the hash at each step to see how many concurrent users you have as you progress through (past) time. The other route I'd go is producing a second stream of records sorted by "end" time. You could do this by putting the records into a database with an index on the "start" time and an index on the "end" time (or just do an "order by" on the computed "end" time, if your problem size isn't too large for that to be reasonable). Or you could just produce a second log file with the "end" time up front and use an efficient external 'sort' command to put that second log in the desired order. Then you merge the two streams. Reading a record from the "start" stream adds an item to your hash of "concurrent users" (keyed by user ID this time). Reading a record from the "end" stream deletes an item from your hash. If you have the added complexity of perhaps multiple transactions per user, then you'd add a second hash recording the number of concurrent transactions for each user and delete each user when their count goes to zero. - tye	[reply]
Re^2: How to Iterate to Identify Concurrent Users (trees, merge) by Limbic~Region (Chancellor) on Oct 26, 2007 at 14:14 UTC
tye, Since Perl doesn't have that, you could certainly find a module that implements hashes with sorted keys and use that... I wrote Tie::Hash::Sorted as a result of someone at this site wanting a sorted hash. It was written with every effort at optimizations a monk with a little over a year of perl under his belt could muster. Cheers - L~R	[reply]
Re: How to Iterate to Identify Concurrent Users (SQL) by fenLisesi (Priest) on Oct 26, 2007 at 08:50 UTC
You could use SQL, as tye hinted. Suppose you load your data into a table named `sesn` (session) with fields `sesn_id, usr_id, sesn_bgn, sesn_end` and whatever else you may need, properly indexed etc. Then, a query like the following should do what you want (untested): `SELECT foo.usr_id AS user, foo.sesn_id AS session1, bar.sesn_id AS session2, foo.sesn_bgn AS start1, bar.sesn_bgn AS start2, foo.sesn_end AS end1, bar.sesn_end AS end2 FROM sesn foo, sesn bar WHERE foo.usr_id = bar.usr_id AND bar.sesn_bgn >= foo.sesn_bgn AND bar.sesn_bgn <= foo.sesn_end AND foo.sesn_id != bar.sesn_id -- AND (conditions on min duration etc here)` [download] Here I have assumed that you are trying to track multiple simultaneous sessions by the same user. If that's not what you meant, adjust accordingly. Hope this helps. Cheers.	[reply] [d/l] [select]
Re^2: How to Iterate to Identify Concurrent Users (SQL) by Anonymous Monk on Oct 26, 2007 at 13:32 UTC
Thanks tye and fenLisesi. I initially started with a database model but found it to be more cumbersome. After your good advice, I created a view and ran fenLisesi query with some modifications. It did the trick. Its slow though but it gets me what I need. I'll benchmark it with Perl code to see which is more efficient. You monks rock!!! `create view foo as select session_id, start_session, end_session, user +_id from bar;` [download] </code> Then I ran `SELECT DISTINCT foo.user_id AS user, foo.session_id AS session1, bar.session_id AS session2, foo.start_session AS start1, bar_.start_session AS start2, foo.end_session AS end1, bar_.end_session AS end2 FROM foo,bar WHERE foo.user_id != bar_.user_id AND bar_.start_session >= foo.start_session AND bar_.start_session <= foo.end_session AND foo.session_id != bar_.session_id order by start1 ASC;` [download]	[reply] [d/l] [select]
Re: How to Iterate to Identify Concurrent Users by GrandFather (Saint) on Oct 25, 2007 at 23:58 UTC
What do you want to output? A single number indicating the greatest number of concurrent users? A list indicating the number of concurrent users whenever that number changes? A list of hourly (or some other period) reports indicating the maximum number of concurrent users in that period? Something else? If you have a list of connect/disconnect times then you can sort the list by event, then iterate through the list incrementing or decrementing a connected count as appropriate for each event. Perl is environmentally friendly - it saves trees	[reply]
Re^2: How to Iterate to Identify Concurrent Users by Anonymous Monk on Oct 26, 2007 at 00:21 UTC
Quite simply, I would like to output the total number of concurrent users. The list is sorted since it is chronologically based. Thanks Grandfather.	[reply]
Re^3: How to Iterate to Identify Concurrent Users by merlyn (Sage) on Oct 26, 2007 at 00:34 UTC
"The total number of concurrent users" doesn't define much. If you mean "the number of users ever", then say that. If you mean "the number of users that had at least one other user logged in at the same time", that might make sense. For example, if one user logs in at 5, and leaves at 8, and another user logs in at 6, and leaves at 9, and a third user logs in from 10 to 11, how many is that? Two, or Three, or none? :) Or maybe you mean average users over time, which from 5 to 11 pm on that example is (calculating... 3 + 3 + 1 over 6) 2.333 or so. -- Randal L. Schwartz, Perl hacker	[reply]
Re^4: How to Iterate to Identify Concurrent Users by Anonymous Monk on Oct 26, 2007 at 01:00 UTC
Re^5: How to Iterate to Identify Concurrent Users by GrandFather (Saint) on Oct 26, 2007 at 01:30 UTC
Re^3: How to Iterate to Identify Concurrent Users by ikegami (Patriarch) on Oct 26, 2007 at 00:32 UTC
Do you mean "the maximum number of concurrent users over the span of the log file"? The total of one number is rather uninteresting, especially when you're not clear on what that number is.	[reply]
Re^4: How to Iterate to Identify Concurrent Users by Anonymous Monk on Oct 26, 2007 at 01:18 UTC
Re^5: How to Iterate to Identify Concurrent Users by ikegami (Patriarch) on Oct 26, 2007 at 01:24 UTC
Re^5: How to Iterate to Identify Concurrent Users (SNMP) by fenLisesi (Priest) on Oct 26, 2007 at 11:59 UTC