PerlMonks  

A CGI chained-event theory question

by JPaul (Hermit)
on Feb 04, 2002 at 03:29 UTC ( [id://143152]=perlquestion )

JPaul has asked for the wisdom of the Perl Monks concerning the following question:

Greetings all,
I have a bit of a conundrum on my hands and I've been mulling over it for a few days now and decided I'd like to get some other expert opinions on the matter.
Background: A website that keeps track of users by cookie sessions has a submit form. At the bottom of the form is a checkbox called "Add another one".
The idea is that you fill in the submit form (we're entering details on car parts), and if you want to add another stock item, you check the checkbox and hit the 'Submit' button, which gives you another form to work with, ad nauseam, until you leave the 'Add another one' checkbox unchecked, hit Submit, and land on a summary page.

I want to build a master summary with all the items entered by one user at each grouping. What I am currently doing (just to get it running) is I keep an internal ID of this series of items you submit, giving the same "batch ID" to each item submitted in a row, and when you hit the final Submit button, it pulls everything out with that Batch ID and groups them together into one entry for the Summary Page.
Now this is all well and good, and if nothing goes wrong, it works nicely - but what if something does go wrong? What if the browser crashes, or the user follows another link, and we lose that chain? Now all the items have been entered into the DB, but I don't know when to go and group them... Not to mention that keeping batch numbers is pretty ugly.

What is the best way of doing this?
I have thought of datestamping the items, and then every half-hour grouping them and making the summary - but if you're still entering stuff when that half-hour hits, then whatever's entered after that isn't grouped.
I could also do the grouping when the user logs out - but what if they only enter one item a couple of times a day, and log out in between, so I end up with a bunch of single items in each summary?

I hope this is relatively clear on what I'm doing, and what I'm asking. All suggestions would be appreciated.
The idea is to group these items together in the most intelligent manner... But I'm not expecting to group EVERYTHING they did together, rather maybe make up a list only once a day - but still, I want it to appear somewhat real-time as well...?
Am I asking for just too much functionality for what should be a simple feature? Are these too many conflicting "I wants" that just can't go together? Am I completely confusing?
My thanks for your opinions on the matter,

JP
-- Alexander Widdlemouse undid his bellybutton and his bum dropped off --

Replies are listed 'Best First'.
Re: A CGI chained-event theory question
by rob_au (Abbot) on Feb 04, 2002 at 10:13 UTC
    I actually think that this is a very interesting question and am surprised not to see more comments already in this thread ...

    Based on the fairly simple overview of the process currently in place, I think the solution lies in the means by which you store your session and input data. From the overview given, I am guessing you are storing the data entered within the 'session' on the client side, hence the problem when the input train is interrupted. This is where your biggest mistake lies ... Client-side cookies should not be used for any 'real' session data storage. Instead, the client-side cookie should only store a unique, and preferably unpredictable, pseudo-random pointer to where this data is stored on the server.

    How can this work with the given scenario? Well, the answer to this question will depend very much upon the specifics of client cookie management and server-side data storage with your script. However, imagine the following as an order of operation for your application ...

    • Client connects to the initial interface page of your application, where a client-side cookie is set (CGI::Cookie) - The cookie at any stage only contains a unique, unpredictable, pseudo-random identifier (think Digest::MD5) for the session; no data entered at any point throughout the entry process is stored in this client-side cookie.
    • The client progresses through the entry pages and data entered is stored server-side, updated at each step throughout the process. This server-side storage need not be complex and could quite simply take the form of a serialised DBM hash, indexed by the unique key stored in the client cookie.
    • At the completion of the entry process where the client unchecks the box for further data entry, the server-side data collected for the user session is submitted to your database in the existing manner.

    This method could be streamlined through the use of a server side session management and data storage solution such as Apache::Session - Indeed, if you really wanted to, you could actually take out the final step of this process by expanding your existing database tables to include sessional identification such that data entered during the session could be submitted to and updated within the database as entered. The expansion of data storage in this manner, paired with a second table matching up client environment details, collected at the time of session initiation, with the sessional identification would allow you to track user-entered data quite precisely.
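
    The order of operation described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the %session_store hash is a hypothetical in-memory stand-in for the DBM file (or Apache::Session), the function names are made up for this example, and the identifier recipe is just one way of feeding Digest::MD5. Note that rand() is not cryptographically strong.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical in-memory stand-in for the server-side store (a DBM file
# or Apache::Session would replace this hash in a real application).
my %session_store;

# Build a hard-to-guess session identifier from the time, the process ID
# and a few random numbers. rand() is not cryptographically strong;
# treat this as an illustration only.
sub new_session_id {
    return md5_hex(join '|', time(), $$, map { rand() } 1 .. 4);
}

# Append one submitted form (a hash reference) to the session's item list.
sub add_item {
    my ($session_id, $item) = @_;
    push @{ $session_store{$session_id} }, $item;
    return scalar @{ $session_store{$session_id} };
}

# Called when the "Add another one" box is left unchecked: hand back all
# collected items and forget the session.
sub finish_session {
    my ($session_id) = @_;
    my $items = delete $session_store{$session_id};
    return $items ? @$items : ();
}

# Only $sid would travel in the cookie; the data stays on the server.
my $sid = new_session_id();
add_item($sid, { part => 'alternator', qty => 2 });
add_item($sid, { part => 'fan belt',   qty => 1 });
my @batch = finish_session($sid);
print scalar(@batch), " items ready for the database\n";
```

    If the browser dies halfway through, the half-finished list is still sitting on the server under its identifier, so nothing depends on the client reaching the final Submit.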

    The approach of pooling data from sessions into half-hour blocks for submission, and treating each block as one session's worth of entries, really doesn't scale too well to a system where you might have dozens of people entering information at once. Furthermore, collecting the data server-side in this manner forms the rudiments of a 'true' session system, similar to that described previously.

    Good luck

     

    perl -e 's&&rob@cowsnet.com.au&&&split/[@.]/&&s&.com.&_&&&print'

Re: A CGI chained-event theory question
by screamingeagle (Curate) on Feb 04, 2002 at 06:04 UTC
    Well, you could make a summary page which displays all the records added for the entire day, grouped by half-hour intervals. This page could be accessible by the user whenever he/she wishes to view the orders added so far...
    Additionally, if the user wishes to see orders submitted on some other day, you could add a search portion to the summary page, with the results displayed grouped by the datestamp in 30-minute intervals, of course.
    This way, you don't have to worry about when the user logs out, or if the browser crashes, etc.
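
    A minimal sketch of that half-hour grouping, assuming each record carries its datestamp as epoch seconds in a field called 'stamp' (a hypothetical name), and using made-up sample records:

```perl
use strict;
use warnings;

# Group records by 30-minute bucket. Each record is a hash reference with
# an epoch-seconds 'stamp' field (a hypothetical column name).
sub group_by_half_hour {
    my @records = @_;
    my %buckets;
    for my $rec (@records) {
        # Round the timestamp down to the start of its half-hour slot.
        my $slot = $rec->{stamp} - ($rec->{stamp} % 1800);
        push @{ $buckets{$slot} }, $rec;
    }
    return \%buckets;
}

my $groups = group_by_half_hour(
    { part => 'alternator', stamp => 1_012_795_200 },
    { part => 'fan belt',   stamp => 1_012_795_200 + 600 },
    { part => 'radiator',   stamp => 1_012_795_200 + 2_000 },
);

# Print one line per half-hour slot, oldest first.
for my $slot (sort { $a <=> $b } keys %$groups) {
    my @parts = map { $_->{part} } @{ $groups->{$slot} };
    print scalar(gmtime $slot), ": @parts\n";
}
```

    Because the grouping is computed at display time, nothing has to be decided when the item is stored.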
Re: A CGI chained-event theory question
by termix (Beadle) on Feb 04, 2002 at 14:33 UTC

    You use the phrase "user logs out" so I assume people have accounts on this service. There are a number of ideas on the table already with regard to sessioning, storing only session IDs in client cookies and using server storage for the lists.

    I have always tried to associate the 'timeout' with each person (session) separately and base it on the inter-message delay. So, for example, anyone who is working on a list will lose the list if they leave it unfinished for a day. Depending on the nature of these lists, you might also want to use a 'list' key in addition to a session key, allowing the same visitor to manage multiple lists.

    Another thing we found when building sites for clients is that even within six months the traffic patterns on their sites would change (often increase), and hence it was highly advisable to link that 'timeout' to an actual resource parameter.

    For example, the 'timeout' would vary between 1 hour, 6 hours, 12 hours and 24 hours, based on a function that took into consideration the disk space available, the number of new lists being created over the past three hours, the 'authority' level of the user, and the calculated decay rate of information (how much disk space will be freed up in the next three hours). Once a timeout is assigned, and the user is told about it, it shouldn't change.
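
    To give a flavour of such a function, here is a rough sketch. Every parameter name, threshold and weight in it is invented for illustration; the real function would have to be tuned against your own disk and traffic figures.

```perl
use strict;
use warnings;

# Pick a list timeout (in hours) from the fixed steps 1, 6, 12, 24.
# Every threshold and weight below is invented for illustration.
sub choose_timeout_hours {
    my (%arg) = @_;
    # Space expected to be available over the next three hours:
    # what is free now, plus what expiring lists will release,
    # minus what new lists are likely to consume.
    my $headroom = $arg{free_mb} + $arg{expiring_mb}
                 - $arg{new_lists_3h} * $arg{avg_list_mb};
    my $step = $headroom > 1000 ? 3
             : $headroom > 500  ? 2
             : $headroom > 100  ? 1
             :                    0;
    # Users with a higher 'authority' level get one step more,
    # capped at the longest timeout.
    $step++ if $arg{trusted} && $step < 3;
    return (1, 6, 12, 24)[$step];
}

my %load = (free_mb => 400, expiring_mb => 50,
            new_lists_3h => 100, avg_list_mb => 1);
my $ordinary = choose_timeout_hours(%load, trusted => 0);
my $trusted  = choose_timeout_hours(%load, trusted => 1);
print "ordinary user: ${ordinary}h, trusted user: ${trusted}h\n";
```

    The result would be computed once, when the list is created, and stored with it, so the timeout the user was told about never changes.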

    If you have a large number of users hitting the site, especially if users can sign on automatically, this is a semi-decent safeguard against running out of resources because of heavy use (or malicious use).

    And of course, move any finished lists to a separate dedicated data storage.

    -- termix

Re: A CGI chained-event theory question
by shotgunefx (Parson) on Feb 04, 2002 at 11:02 UTC
    Is this a site that anyone can access or just registered/authenticated users? I mean, if you don't know who people are, and they never finish submitting, then I would think their input doesn't matter.

    If they have usernames, couldn't you collect the data by that? Then just remove stuff from the user's batch/list at some specified interval for processing.

    -Lee

    "To be civilized is to deny one's nature."
Re: A CGI chained-event theory question
by JPaul (Hermit) on Feb 04, 2002 at 18:23 UTC
    Greetings again,
    Fortunately my problem is much simpler than the replies above address, but it shows that my post that I reread and rewrote 30 times before finally posting just didn't contain enough information to be clear.

    My users have accounts in the web system, their cookies contain only their account names (And, in fact, an MD5 checksum of that name and an internal value to prevent cookie tampering).
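
    For the curious, the cookie check amounts to something like the sketch below. The secret, the ':' separator and the function names are placeholders made up for this example, not the values the site actually uses.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical server-side secret; it never leaves the server.
my $SECRET = 'change-me';

# Cookie value: account name plus an MD5 checksum of name and secret.
sub make_cookie_value {
    my ($account) = @_;
    return join ':', $account, md5_hex("$account:$SECRET");
}

# Return the account name if the checksum matches, otherwise undef.
sub check_cookie_value {
    my ($value) = @_;
    my ($account, $sum) = $value =~ /^(.*):([0-9a-f]{32})$/ or return;
    return $sum eq md5_hex("$account:$SECRET") ? $account : undef;
}

my $cookie = make_cookie_value('jpaul');
print "cookie accepted for ", check_cookie_value($cookie), "\n";
```

    Changing the account name in the cookie without knowing the internal value makes the checksum mismatch, so the cookie is rejected.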
    All data is stored on the server, with each form inserted into the DB (MySQL) after each submit. It is only after the final submit that the data is collected together (since currently the items all share a batch number allocated when you do your first submit), formatted and output into the Summary Page.
    All user input is important and should appear on the Summary Page.
    What I want is a Summary Page that, after the final Submit, contains the most recent groupings, BUT if a user enters a bunch of items now and then another bunch 20 minutes later, they should somehow be amalgamated into a single summary, saving untidy splotches of their day's input all over the Summary Page.

    I believe, however, that the answer has come to me - simple, and relatively concise.
    Do away with Batch IDs, but put a datestamp on each DB entry. After each Submit, look around for any entries in the Summary Page table from my user in the last 12 hours or so, and if one is found, append my latest item onto that. That way it's real-time (as soon as they submit, it's attached to a summary somewhere), and if they spread out data entry over a day, it will still be attached to the same Summary for tidiness.
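
    In rough Perl, the idea looks something like this. It is only a sketch: the %summaries hash is a hypothetical in-memory stand-in for the Summary Page table, the field names are made up, and I am assuming the 12-hour window is measured from the most recent item in the summary rather than from the first.

```perl
use strict;
use warnings;

my $WINDOW = 12 * 60 * 60;    # "the last 12 hours or so", in seconds

# Hypothetical in-memory stand-in for the Summary Page table: one list of
# summaries per user, each summary holding its latest datestamp and items.
my %summaries;

# Attach an item to the user's most recent summary if that summary was
# touched within the window; otherwise start a new summary.
sub add_to_summary {
    my ($user, $item, $now) = @_;
    my $list   = $summaries{$user} ||= [];
    my $latest = $list->[-1];
    if ($latest && $now - $latest->{updated} <= $WINDOW) {
        push @{ $latest->{items} }, $item;
        $latest->{updated} = $now;
    }
    else {
        push @$list, { updated => $now, items => [$item] };
    }
    return scalar @$list;    # number of summaries this user now has
}

my $t0 = 1_012_795_200;                                 # arbitrary start time
add_to_summary('jpaul', 'alternator', $t0);
add_to_summary('jpaul', 'fan belt',   $t0 + 20 * 60);   # 20 minutes later
my $count = add_to_summary('jpaul', 'radiator', $t0 + 14 * 3600);
print "jpaul has $count summaries\n";
```

    No final Submit is needed for the grouping to happen, so a crashed browser simply leaves a shorter summary behind.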

    You may find it amazing that it took me days of brainstorming, and eventually the 30 minutes+ it took me to form the above posting to ask you for your opinions... So sue me.
    However, in doing so (posting), I seem to have managed to collect my thoughts, and as all good ideas do, the answer came to me last night at 3am in bed.

    My thanks for your good answers (even if not entirely related to what I really wanted to say), and bothering to read my drivel.

    JP
    -- Alexander Widdlemouse undid his bellybutton and his bum dropped off --

      A couple of points:

      Session IDs should be dynamically generated as rob_au suggests and stored in a session table for better security. Otherwise, if your database is compromised, someone can spoof all your users. It just adds another layer of security to your app. Generate another cookie each time they log in. Make sure you also provide an option for the user to log out.

      It sounds like you could use a 'staging' area for your data. (if the entries are not required immediately). How about this:

      1. Create a table called stage with a username and a data column.
      2. Each time the enter button is pressed, go and get what has already been stored.
      3. Store your incomplete list in the database as an array, serialised with Storable.pm.
      4. Check to see if the "no more data" checkbox has been checked.
      5. If the checkbox is selected, thaw everything out, then chuck it in the database as you need. If not, thaw it all out, push @existing_data, @new_data, freeze it, and put it back.
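
      The steps above could be sketched like this. The %stage hash is a hypothetical stand-in for the stage table (username as key, frozen data as value), handle_submit is a made-up name, and $add_another mirrors the "Add another one" checkbox from the original question.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical stand-in for the 'stage' table: username => frozen data.
my %stage;

# Handle one press of the Submit button. Returns the complete list when
# the user is finished, or an empty list while more entries are expected.
sub handle_submit {
    my ($user, $new_item, $add_another) = @_;
    # Thaw whatever has been staged so far and add the new entry.
    my @items = exists $stage{$user} ? @{ thaw($stage{$user}) } : ();
    push @items, $new_item;
    if ($add_another) {
        # More to come: freeze the longer list and put it back.
        $stage{$user} = freeze(\@items);
        return;
    }
    # Finished: clear the staging row and hand everything to the caller.
    delete $stage{$user};
    return @items;
}

handle_submit('jpaul', { part => 'alternator' }, 1);
handle_submit('jpaul', { part => 'fan belt' },   1);
my @finished = handle_submit('jpaul', { part => 'radiator' }, 0);
print scalar(@finished), " items moved out of staging\n";
```

      With a real table, the frozen string would go into a binary (BLOB) column, since Storable output is not plain text.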

      If you use this method, you are not time dependent at all. (What happens if they want to continue the list after your "Time Out" period, say after going to lunch or coming back after the weekend?)

      If the inputted data is required straight away, then you could do a similar thing, except add a "complete" column to your items table.

      To provide a summary, you scan all the rows belonging to your user with the flag set saying the list is not complete. On displaying the summary, you set all the flags to "The list is complete".
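
      A small sketch of the flag approach, with an in-memory array standing in for the items table; the column names and sample rows are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the items table, with a 'complete' column.
my @items_table = (
    { user => 'jp',  part => 'alternator', complete => 0 },
    { user => 'jp',  part => 'fan belt',   complete => 0 },
    { user => 'lee', part => 'radiator',   complete => 0 },
    { user => 'jp',  part => 'old entry',  complete => 1 },
);

# Collect the user's rows that are not yet marked complete, then mark
# them so the next summary starts from a clean slate.
sub summarise {
    my ($user) = @_;
    my @open = grep { $_->{user} eq $user && !$_->{complete} } @items_table;
    $_->{complete} = 1 for @open;
    return map { $_->{part} } @open;
}

my @parts = summarise('jp');
print "summary for jp: @parts\n";
```

      Rows belonging to other users, and rows already flagged, are left untouched.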

      Obviously this is a little simplistic, but if you flesh it out a bit, it should work pretty well.

      HTH
