in reply to Suggestions on storing loosely similar data

Yep this sounds more like a database design issue, not a perl issue. If I was to design such a feature, I would create a user access log table somewhat like this:

Table: access_log
seq_noSequence Number
uidUser ID
access_timeTimestamp
report_idReport ID
report_optionReport Option

Then it's just a simple matter of constructing your SQL queries to make use of the access log.

By the way, you are using a database backend, aren't you?

(I missed the sequence id in my post earlier, here's the amendment, thanks! And of cause this is to assume the traffic volume is relatively small.)

  • Comment on Re: Suggestions on storing loosely similar data

Replies are listed 'Best First'.
Re: Re: Suggestions on storing loosely similar data
by zby (Vicar) on Sep 11, 2003 at 07:40 UTC
    This way you would have one record for every Report Option for every report. If you use the timestamp function in the insert clause it would be possible that the options get a different timestamps. This is not a desired situation.

    The root of the problem is that this table is still not enough normalized. I would make it two tables one for uid, access_time, report_id and the key seq_no and another for just report_option and key seq_no.

      Let me see if I follow. This table is an access log (reports run):

      seq_no (auto incremented key)uidaccess_timereport_id
      this recordwhowhenwhich report

      And then there's another that tracks the individual options for each report in the access log:

      seq_no (key)report_options
      record lookupstring of report options

      Is that what you were proposing? It seems good to me, though I want to percolate on that for a bit (since I'm still getting my head around how this access log is going to be used... and of course, you all have no idea :) ).

      So my code would do this to figure out how many reports were run by each uid w/in a date range (excuse my psuedo code in advance):

      (psuedo code) select count(seq_no) where uid like $uid and report_id like $report_id + and $access_range1 > access_time > $access_range2

      And to check to see if a user was re-running a report (ie running a report w/ identical options w/in a certain time period) I'd match report_id, uid and access_time, to get seq_no that matched. Then I'd check the suspected re-run report options with report_options in table2.

      Sound about right? Is there further normalization that could happen? Thoughts on splitting the report_options out like this? Anything else? I really appreciate your thoughts on this.

      AH

        I was commenting the other guy design and I understood that the last column is Recor Option - the singular form. That is that for each option you have a different record. Then you need the normalization. When you group the options into a string the table is normalized (enough at least for most needs). Sorry if I misled you.
Re: Re: Suggestions on storing loosely similar data
by bean (Monk) on Sep 11, 2003 at 03:30 UTC
    This looks good, except I would give every record a sequential id - otherwise you have a four-column key...