punch_card_don has asked for the wisdom of the Perl Monks concerning the following question:

Mulching Monks,

Is there any good reason to prefer a Perl or a Mysql solution to the following issue:

A user is going to submit multiple option choices via a web form. Say one through N. They will be submitting these choices regularly, and I want to keep track of what their choices were each time. They will be stored in a Mysql database, but I see two possiblities:

1. Give this update event a unique identifier, create an 'options' table with columns 'update_id' and 'option', and store a separate update_id - option_id pair for each option chosen. That is, if the user selects 5 options on Monday, ther ewill be 5 rows inserted into the table, each with Monday's update id. To retireve the options selected on Monday, I select all rows that have Monday's update id.

2. Pre-agglomerate the options in Perl into a formatted string, such as 1|2|5|8|13|55, and store that in a single inserted row. To retrieve Monday's selections, I select the single row with Monday's update id and do a split.

There are about 1,000 users, and each will be making about 6 updates per year, making about 5 to 10 option selections per update. And I have to keep the data for 7 years. So version 1 could end up with ~1/4-million rows, where as version 2 will have ~1/5 ~ 1/10th that, but require the extra Perl.

Speed of retrieval is not critial, per se, but neither will it look good if my system chunks along like a tug boat. So, which is likely to be faster?

Thanks.




Forget that fear of gravity,
Get a little savagery in your life.
  • Comment on Perl or Mysql to store multiple choices?

Replies are listed 'Best First'.
Re: Perl or Mysql to store multiple choices?
by Joost (Canon) on Mar 29, 2007 at 21:11 UTC
      After reading the link provided, I think this verges on genius for my application.

      Thanks.

      The only real potential for problems I see is what happens if the options change down the road - looks like with this I'm into modifying the table.




      Forget that fear of gravity,
      Get a little savagery in your life.
Re: Perl or Mysql to store multiple choices?
by geekphilosopher (Friar) on Mar 29, 2007 at 21:13 UTC
    A database with 250 000 rows is considered small in DBA circles. Use the database for what it's good for - don't reinvent the wheel. The developers at MySQL are probably better at writing efficient queries than you are, especially since they use C to do so. Store as much as you can in a well laid out database, with as many columns as make sense - it'll be faster now, and it will certainly be faster in the future when you think of a new query that you want to execute.
      Yes, very sensible replies, both.

      But now let me complicate things: There are four questions, each with 5 or 10 possible choices, and multiple choices are allowed. Each update, the user answers all four questions.

      So, I could have four separate tables, Q_1 to Q_4, each with 'update_id' and 'option' columns. To retirieve the selections of Monday, I have either four queries to create, prepare, execute and retrieve, or a four-table join to do. Versus one table with update_id and Q_1 to Q_4 columns, from which I do a single query and four splits.

      I'm unsure that four queries, or a four table join, is going to be faster than a single query. Each query is a call to the DBI - an extra layer. Whereas a split is pure Perl.

      But I agree that it's certainly not a purist's approach to database design.




      Forget that fear of gravity,
      Get a little savagery in your life.
Re: Perl or Mysql to store multiple choices?
by talexb (Chancellor) on Mar 29, 2007 at 21:41 UTC

    It sounds like you're asking if you should compress multiple selections into a single row or not -- with the database select being a given.

    Based on that, compressing and decompressing the data (with a join and split) would be my suggestion; that's got to be faster than doing an INSERT/UPDATE and then a SELECT for multiple fields. I expect the code would look cleaner too.

    Although I've read benchmarks, I've never used Benchmark, but it would definitely tell you what's faster, and then you'd have the ability to balance that speed with the associated complexity of the code.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

      Yes, that's exactly what I'm talking about. And what I'm thinking.

      When a user selects options 1,4,7,9, and 15 on Monday, either:
      a) store it as 5 separate insertions of update_id - option pairs
      b) first join them into a delimited string and do one insert

      Then on retrieval, either:
      a) select all rows with Monday's update ID (which means a while loop)
      b) select a single row and split the retrieved string

      I expect the first reaction to be "let the DB do its work; it's undoubtedly smarter and faster", but I suspect that the counter-intutitive ins in fact true: that the extra DBI layer, with its preparation and the retrieval loop, is going in fact ot be slower than a pure join or split, especially when it gets more complicate with multiple questions to store and therefore multiple queries to do.




      Forget that fear of gravity,
      Get a little savagery in your life.
        punch card don:

        With the speed of today's computers, and the tiny size of your database, speed isn't going to be an issue. Basically, you should write the cleanest code you can, to make maintenance simpler. Only after you have something and it proves to be too slow (which I don't expect to see in this case), do you look for optimizations.

        I'm not advocating being purposefully wasteful...It's just that programmer time is much more expensive than computer time....

        --roboticus

Re: Perl or Mysql to store multiple choices?
by chrism01 (Friar) on Mar 30, 2007 at 00:19 UTC
    I would definitely go with the DB oriented soln:
    Create 1 table with cols for update_id, option, qn_num and index as needed. The main performance issue is going to the DB, but a single table select with indexes is v fast and as noted, 250k rows isn't that many.
    The while loop to retrieve will not take a noticeable amt of time.

    Cheers
    Chris

Re: Perl or Mysql to store multiple choices?
by hangon (Deacon) on Mar 30, 2007 at 11:36 UTC

    Personally, I would recommend using separate rows. Compressing your data into a string will defeat your ability to make ad-hoc queries on the database. Trust me on this - when one day your PHB wants to see the data in a different manner than you describe, you'll be happy you did.

      At the risk of being a flip-flopper, yes, this consideration came to me in the night. If I do the string compression, then some day when we want, for example, to do stats on how many people chose option x, then I'll have to select the entire database and parse out the answers myself.

      OK, I'm sold.




      Forget that fear of gravity,
      Get a little savagery in your life.
        I agree, that you should use individual columns for all data. If you ever need specfic data out you then have to process the entire string and split it out. You will save yourself a lot of time by setting it up right the first time rather than wishing you had done it that way the first time.