in reply to Multiple date ranges

actually it's a pipe delimited Unicode file, but don't worry about that

That is your main problem. Move to a relational database (SQL) and your problem is simplified to simple SQL syntaxt.

You'll be better off with a lightweight SQL database like DBD::CSV or DBD::SQLite everytime (if you cannot afford Mysql/Pgsql and the like).

Replies are listed 'Best First'.
Re: Re: Multiple date ranges
by AndyH (Sexton) on Feb 03, 2003 at 13:08 UTC

    I knew I should have qualified the statement about the pipe-delimited file. This is one of those "*I* wouldn't have done it this way, but I've inherited this system" problems.

    In the long-term, I indeed do intend to move it to a DBI/DBD based system, but there are all sorts of other details I won't bore you with that are going to make that a timescale of months, not days. Meanwhile, I still need to solve this in a non-SQL way using what I have: hashes made by reading the pipe-delimited file ...

    A.

      What format are the timestamps in?

      Convert the timestamps to seconds (use any means neccessary).

      Create a datastructure resembling

      $foo{'bar'} = [ 1, # first joined 2, # left 3, # joined again 4, # left 5, # joined again ];
      If a user has an odd number of timestamps, he's still a member.

      It's fairly obviouls how to go about querying for other values.

      If the code is clean, switching to DBI needn't take more than a few days (i got a feeling its not though *shudder*).

        The timestamps were entered as free text in a web form. No checking of any kind - pick a format, any format. I have a roughly 200 line perl regexp script that has got (nearly)all of them into ISO format (YYYY-MM-DD).

        Spotting who is currently a member is easy (apart from anything else, that info exists in another database I didn't tell you about). What I was looking for was a more elegant way of handling the other queries, e.g. using Date::Range, than a brute force approach...

      You don't need to solve this in a non-SQL way.

      Take this opportunity to port your database to SQL (build it out of the pipe file).

      Then run SQL queries (which is what you're trying to do), so by the time your frontend catches up (the other code), you already have a nice database/search backend all ready.

      Using CGI::Application the re-development of your frontend can take lots and lots and lots less time.

      Using DBD::SQLite can save you the setup of a real database.

        I looked at CGI::Application and am looking again, as there is an updated version just out, but I am trying to solve the (many) information problems first, before I set about improving the CGI ...

        I also looked at SQLite, but it's not clear whether it handles Unicode properly or not. The docs don't mention it; the only reference is in the changes file, where it says "fixed Unicode tests".

        Another problem I have is that some of the data is in XML format and I'm sure you know that mapping XML onto the RDBMS model can be a sure-fire headache generator.

      use DB_File; with the $DB_HASH and allow keys to have multiple values.

      Make the the keys the date stamps, and the values user names.

      Now you can do partial matches, and can have a many-to-one relationship (on 2003-02-01 , 10 users joined, 4 left)

      Since you can have multiple values, you can make the values contain if a user left or joined (pipe it to me), or you can have 2 hashes (%joined, %left), or simply prefix the date strings (left-2003-02-01 or joined-2002-09-22) so you can keep a single %hash.

      I'd do it this way, cause I love DB_File.

        I mean one-to-many ;)

        DB_File is fast.

      You don't need to solve this with SQL if you don't want to. In perl, TMTOWTDI. I would suggest you look at the module Date::Set. You should be able to add to the sets like this:
      $a->event( start => '19971024T120000Z', end => '19971025T120000Z' ); # one period $a->event( start => '20001101T120000Z'); # open-ended
      Simple set-inclusion queries will tell you who is active on a given date. SQL might be better in the long run, but something like this would get you going.

        Thanks for this suggestion. Do you think it will scale OK with the 4500+ records and the fact the membership dates go back to 1949?

        Just checked the info for Date::Set

        (no support for years outside the 1970-2038 range).

        So, that's no good for me, as I have dates going back to 1949 ...