Re: Suggestions on storing loosely similar data

Yep this sounds more like a database design issue, not a perl issue. If I was to design such a feature, I would create a user access log table somewhat like this:

Table: access_log
seq_no	Sequence Number
uid	User ID
access_time	Timestamp
report_id	Report ID
report_option	Report Option

Then it's just a simple matter of constructing your SQL queries to make use of the access log.

By the way, you are using a database backend, aren't you?

(I missed the sequence id in my post earlier, here's the amendment, thanks! And of cause this is to assume the traffic volume is relatively small.)

Comment on Re: Suggestions on storing loosely similar data

Replies are listed 'Best First'.

Re: Re: Suggestions on storing loosely similar data
by zby (Vicar) on Sep 11, 2003 at 07:40 UTC

The root of the problem is that this table is still not enough normalized. I would make it two tables one for uid, access_time, report_id and the key seq_no and another for just report_option and key seq_no.

[reply]

Re: Re: Re: Suggestions on storing loosely similar data

by alienhuman (Pilgrim) on Sep 12, 2003 at 15:31 UTC

Let me see if I follow. This table is an access log (reports run):

seq_no (auto incremented key)	uid	access_time	report_id
this record	who	when	which report

And then there's another that tracks the individual options for each report in the access log:

seq_no (key)	report_options
record lookup	string of report options

Is that what you were proposing? It seems good to me, though I want to percolate on that for a bit (since I'm still getting my head around how this access log is going to be used... and of course, you all have no idea :) ).

So my code would do this to figure out how many reports were run by each uid w/in a date range (excuse my psuedo code in advance):

(psuedo code)
select count(seq_no) where uid like $uid and report_id like $report_id
+ and $access_range1 > access_time > $access_range2
[download]

And to check to see if a user was re-running a report (ie running a report w/ identical options w/in a certain time period) I'd match report_id, uid and access_time, to get seq_no that matched. Then I'd check the suspected re-run report options with report_options in table2.

Sound about right? Is there further normalization that could happen? Thoughts on splitting the report_options out like this? Anything else? I really appreciate your thoughts on this.

[reply]
[d/l]

Re: Re: Re: Re: Suggestions on storing loosely similar data

by zby (Vicar) on Sep 12, 2003 at 16:22 UTC

I was commenting the other guy design and I understood that the last column is Recor Option - the singular form. That is that for each option you have a different record. Then you need the normalization. When you group the options into a string the table is normalized (enough at least for most needs). Sorry if I misled you.

[reply]

Re: Re: Suggestions on storing loosely similar data
by bean (Monk) on Sep 11, 2003 at 03:30 UTC

This looks good, except I would give every record a sequential id - otherwise you have a four-column key...

[reply]