husker has asked for the wisdom of the Perl Monks concerning the following question:

Ok I'm processing a FlexLM license manager log file so I can show our management usage trends for our insanely expensive CAD software licenses. I want to calculate how much time, on average, a person is using a particular software license. The log format, briefly, is:

16:32:45 08/24/00 ms7 OUT MasterModeller user1@host1
16:32:48 08/24/00 ms7 OUT Drafting user2@host2
16:48:00 08/24/00 ms7 IN MasterModeller user1@host1
etc...

I was going to make a hash keyed on ProductName and user@host, store a string consisting of the date and time for each OUT record, and then do a time-delta calculation when I saw the corresponding IN record by going back to the hash and retrieving the OUT time I had stored. This worked fine until I realized that any given user on a host can check out the SAME product more than once ... i.e., they can have two Drafting licenses checked out simultaneously. My current hash would break, overwriting the first OUT time/date string with the second one.

I thought about just concatenating time+date+product+user@host into one string and pushing it onto an array, then scanning that array when I find an IN record, looking for the same product+user@host. However, it seems like that's going to involve a lot of serial scanning, pattern matching, and string decomposition.

Does anyone have a better suggestion? I can provide more details if required.

RE: Data structure for log file processing
by nuance (Hermit) on Aug 24, 2000 at 20:12 UTC
    I was going to make a hash with keys of ProductName and user@host, and store a string consisting of the date and time for each OUT record

    I thought about just concatenating time+date+product+user@host into one string and pushing it on an array, and then scanning that array

    If you pushed that string from your first suggestion onto an anonymous array, it would be much easier to process. You basically end up with a hash of anonymous arrays that are more easily searched.
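    A minimal sketch of that idea, using the sample lines from the question (the key format and FIFO pairing of IN records with the oldest outstanding OUT are my assumptions, not anything FlexLM mandates):

    ```perl
    use strict;
    use warnings;

    # Sample records from the question
    my @log = (
        '16:32:45 08/24/00 ms7 OUT MasterModeller user1@host1',
        '16:32:48 08/24/00 ms7 OUT Drafting user2@host2',
        '16:48:00 08/24/00 ms7 IN MasterModeller user1@host1',
    );

    my %out;    # "product|user@host" => ref to array of checkout times

    for my $line (@log) {
        my ($time, $date, $daemon, $action, $product, $userhost)
            = split ' ', $line;
        my $key = "$product|$userhost";

        if ($action eq 'OUT') {
            # A second checkout of the same product by the same user
            # just adds another element instead of overwriting.
            push @{ $out{$key} }, "$date $time";
        }
        elsif ($action eq 'IN') {
            # Pair the IN with the oldest outstanding OUT (FIFO);
            # compute your time delta from these two stamps.
            my $checkout = shift @{ $out{$key} };
            print "$key: OUT $checkout, IN $date $time\n";
        }
    }
    ```

    No scanning needed: each IN record goes straight to its arrayref by key, and `shift` removes the matched OUT time so duplicates stay paired up correctly.
    
    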

    Nuance

      Thanks. That got it working! I was so close, yet so far. :)
Re (tilly) 1: Data structure for log file processing
by tilly (Archbishop) on Aug 24, 2000 at 20:08 UTC
    Well merlyn is really fond of Parse::RecDescent for tasks like parsing log files. I would therefore suggest taking a look at that first.
      Well, not for something this simple. There's practically no variation in the lines. A simple hash or perhaps hash of hashrefs should be all that's needed here.

      -- Randal L. Schwartz, Perl hacker

        Yes I took a glimpse at that module and it did seem a little too much like killing flies with a howitzer. :)

        However, should I ever need to parse something less uniform, I'll be sure to remember where to start looking.