The behaviour we're looking at is more-or-less defined by a set of 'black box' algorithms, which we know precious little about, inside the software that controls the Y-values in the process (this software is pretty old and has been through PDP, VAX and now PC platforms with only minor revisions(!)). Many data inputs actually affect the way the Y-values move but, as they're inside the 'black boxes', we're not really interested in their workings; we're just looking at the output.
We start by defining some 'tuning parameters' for the process to use. For simplicity, let's call them A, B1, B2, C and D. Note that B2 is optional. The Y-value jumps directly from A to B1 and from B1 to B2... but after that, from B1 or B2 to C is a continuous distribution, as is from C to D. In the example data I've posted here, the values are: A=44, B1=90 (B2 is not used), C=100 and D=100 (Oops! C would normally be ~10% less than D, but no matter).
When the process starts at the beginning of the day, these values, together with various other inputs, are processed and the first Y-value is produced. In some approximate way, depending on a few other inputs, each new Y-value is calculated from the previous Y-value, so we have something like: Y1 = f(Y0, A, B1, B2, C, D + others) and X1 = X0 + ~Y1.
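To make the shape of the data concrete, here is a minimal Python sketch of that recurrence. The real f is a black box, so `toy_f` is purely a hypothetical stand-in (it just ramps Y towards a target in fixed steps, clamped between A and D), and the names `simulate`, `toy_f`, `target` and `step` are my own assumptions. The only point it illustrates is that each X advances by roughly the new Y, so the samples are not evenly spaced in X.

```python
def simulate(y0, x0, n, f):
    """Iterate Y1 = f(Y0) and X1 = X0 + Y1 for n steps."""
    xs, ys = [x0], [y0]
    for _ in range(n):
        y = f(ys[-1])          # next Y depends on the previous Y
        xs.append(xs[-1] + y)  # X advances by (approximately) the new Y
        ys.append(y)
    return xs, ys


def toy_f(y_prev, target=100, step=10, a=44, d=100):
    """Hypothetical stand-in for the unknown f: ramp towards target."""
    change = max(-step, min(step, target - y_prev))
    return max(a, min(d, y_prev + change))  # keep Y within A..D


xs, ys = simulate(y0=44, x0=0, n=3, f=toy_f)
```

Any real analysis would of course read the logged X/Y pairs rather than simulate them; the sketch only shows why the X spacing varies with Y.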
When we do the analysis, we generally have a complete day's worth of data available... but in terms of tuning the operation (that is, adjusting the A, B1, B2, etc values), we might like to run the analysis on incomplete data; for example, once we get to 10:00am or so, we might run some analysis on today's data and re-define the A, B1, B2, etc values on the fly... but that's not the primary purpose of my application.
After all this, we're basically looking at the Y-value range A to D (or to the highest value achieved in the day OR 'so far') and we're trying to determine two times: when the 'rise' started in the morning and when the Y-values stabilized at a 'maximum value' in the morning. Once we determine that time range, we can make further decisions about whether to use the mid-point, the third points or whatever, to say when the real 'load' on the system 'started'.
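One possible way to pin down that time range is a simple threshold-crossing search, sketched below in Python. This is only an illustration under my own assumptions, not the method we have settled on: `find_rise`, `frac` and `hold` are hypothetical names, the 10% bands and the 3-sample hold are arbitrary, and times are taken to be minutes since midnight. If `high` is not given, the highest Y seen 'so far' is used, so the same function works on a partial day.

```python
def find_rise(times, ys, low, high=None, frac=0.1, hold=3):
    """Return (rise_start, rise_end) times, or None if Y never settles.

    low  : the resting level (e.g. A)
    high : the plateau level (e.g. D); defaults to the max Y seen so far
    frac : width of the 'near low' / 'near high' bands, as a fraction
    hold : consecutive samples that must stay near 'high' to count as stable
    """
    if high is None:
        high = max(ys)
    span = high - low
    lo_thr = low + frac * span
    hi_thr = high - frac * span

    # End of rise: first sample that starts a run of `hold` values near high.
    end = None
    for i in range(len(ys) - hold + 1):
        if all(y >= hi_thr for y in ys[i:i + hold]):
            end = i
            break
    if end is None:
        return None

    # Start of rise: last sample before `end` that is still near low
    # (falls back to the first sample if Y was never that low).
    start = 0
    for i in range(end, -1, -1):
        if ys[i] <= lo_thr:
            start = i
            break
    return times[start], times[end]


times = [360, 375, 390, 405, 420, 435, 450, 465, 480, 495]
ys = [44, 44, 45, 60, 75, 90, 99, 100, 100, 100]
window = find_rise(times, ys, low=44)
```

With the window in hand, the mid-point is just `(window[0] + window[1]) / 2`, and third points follow the same way.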
Once we have this basic algorithm worked out, we'd apply it to the other time period transitions throughout the day. So we'd end up having something like this:
AM Peak: Start Time (T1), End Time (T2)
High Off-Peak: Start (T3 = T2), End (T4)
PM Peak: Start (T5 = T4), End (T6)
Out of all this, we're simply trying to determine a relatively constant 'start' and 'end' point for the various peak periods during the day, so that we can analyze maybe some months of data with a consistent 'start' and 'end' time.
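As a rough sketch of how per-day results might be boiled down to one consistent set of times, the Python below takes the median of each boundary over many days and then builds the three periods with the shared boundaries (T3 = T2, T5 = T4). Using the median is my assumption (it is just a convenient way to ignore the odd unusual day), and `consistent_boundaries` and the dictionary layout are hypothetical; times are minutes since midnight.

```python
from statistics import median


def consistent_boundaries(daily):
    """daily: list of dicts with keys T1, T2, T4, T6, one dict per day."""
    keys = ("T1", "T2", "T4", "T6")
    t = {k: median(day[k] for day in daily) for k in keys}
    return {
        "AM Peak": (t["T1"], t["T2"]),
        "High Off-Peak": (t["T2"], t["T4"]),  # T3 = T2
        "PM Peak": (t["T4"], t["T6"]),        # T5 = T4
    }


daily = [
    {"T1": 420, "T2": 540, "T4": 930, "T6": 1110},
    {"T1": 430, "T2": 550, "T4": 940, "T6": 1120},
    {"T1": 425, "T2": 545, "T4": 935, "T6": 1115},
]
periods = consistent_boundaries(daily)
```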
We would also like to automate how we decide on the A, B1, B2, etc values in response to what the actual loading on the system is. At present, we make arbitrary decisions on what those values may be and they frequently don't give the performance (stability of Y-value) we expect at various times.
I hope this clarifies what we're trying to do...
Again, many thanks for everyone's input - it's been invaluable.
In reply to Re: How to Determine Stable Y-Values... or Detect an Edge..?
by ozboomer
in thread How to Determine Stable Y-Values... or Detect an Edge..?
by ozboomer