in reply to Refactor and simplify
Two stories:
After two weeks of trying to figure out where the error lay, I re-wrote the routine from scratch, reducing it to < 200 lines in the process. Regression testing showed that it fixed the current APAR and hadn't re-instated any of the earlier ones. All tests passed.
Submitted for approval, the change was rejected as "too extensive" and I was instructed to reduce the volume of the change by reverting to the 900 lines and only changing those that required changing. When I explained that the original code was so complex, with too many branches and special cases, that I could not work out which combination of factors caused the original problem, the APAR was closed as "Un-reproducible"!
I suggested a fix: pre-filter the input, detect the circumstances in question, and adjust the structure to an equivalent but different structure that was known not to cause the problem. This went into production while the original code was investigated further.
The problem was eventually tracked down to the depth of recursion required to process the structure of the input, which caused a stack frame (limited to 64k by the 386 segmented architecture) to overflow. The "right way" to fix the problem was to re-compile the library to use the huge model, but various assumptions in the original design meant that this entailed an almost complete re-write of the entire system to use the huge model.
A very pragmatic decision (by an enlightened manager) was taken to generalise the pre-filter so that on the rare occasions when the structure of the input became so complex as to push the recursive routine into stack overflow, the structure was converted to two simpler, equivalent structures. This effectively doubled the headroom and prevented the problem from occurring. It was an easy patch, but saved a huge amount of re-development effort, time and, crucially, money.
Oh that all management decisions were so pragmatic.
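For anyone who wants to picture the pre-filter, here is a rough sketch in Python. It is not the original code. The nested-chain structure, the `STACK_LIMIT` constant and every function name are hypothetical stand-ins, and the recursion limit is simulated with a counter rather than a real 64k stack. The idea is only that the filter measures the depth without recursing and, when the input is too deep, hands the untouched recursive routine two shallower, equivalent pieces.

```python
STACK_LIMIT = 100  # hypothetical: deepest chain the legacy routine can handle


def make_chain(values):
    """Build a nested chain (value, rest) from a sequence, innermost last."""
    chain = None
    for value in reversed(list(values)):
        chain = (value, chain)
    return chain


def recursive_total(node, level=0):
    """Stand-in for the legacy recursive routine; left unchanged by the fix."""
    if node is None:
        return 0
    if level >= STACK_LIMIT:
        # Simulates the fixed-size stack overflowing.
        raise RecursionError("simulated stack overflow")
    value, rest = node
    return value + recursive_total(rest, level + 1)


def chain_depth(node):
    """Measure depth with a loop, so the check itself cannot overflow."""
    depth = 0
    while node is not None:
        depth += 1
        node = node[1]
    return depth


def split_chain(node, at):
    """Return two chains: the first `at` links rebuilt, and the remainder."""
    values = []
    for _ in range(at):
        values.append(node[0])
        node = node[1]
    return make_chain(values), node


def prefiltered_total(node):
    """Pre-filter: pass shallow input straight through, split deep input in two."""
    depth = chain_depth(node)
    if depth <= STACK_LIMIT:
        return recursive_total(node)
    first, second = split_chain(node, depth // 2)
    return recursive_total(first) + recursive_total(second)
```

With a chain of 150 links, `recursive_total` alone hits the simulated limit, while `prefiltered_total` processes two halves of 75 links each. Note that this only doubles the headroom: a chain of more than 200 links would still fail, which matches the trade-off described above.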
Put your idea and reasons to your management. Preferably, grab some production input and output data and set up your testing to show that your refactoring replicates the production code in every way -- including replicating any bugs that it may have as a first pass. Once your refactoring can be used as a replacement for the status quo, it becomes much easier to tackle any bugs it has one by one and ensure that fixing them does not have undesirable knock-on effects on other parts of the system.
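As a minimal sketch of that kind of test harness, assuming Python and made-up routines (`legacy_average`, `refactored_average` and the sample inputs are all hypothetical, and in real life the recorded cases would come from captured production data rather than being generated on the spot):

```python
def legacy_average(values):
    """Hypothetical production routine, quirks included:
    empty input yields 0 and the result is floor-divided."""
    if not values:
        return 0
    return sum(values) // len(values)


def refactored_average(values):
    """Hypothetical rewrite that deliberately keeps the quirks on the first pass."""
    return sum(values) // len(values) if values else 0


def find_mismatches(candidate, recorded_cases):
    """Run the candidate over recorded (input, output) pairs and
    return every case where it differs from what production produced."""
    mismatches = []
    for inputs, expected in recorded_cases:
        actual = candidate(inputs)
        if actual != expected:
            mismatches.append((inputs, expected, actual))
    return mismatches


# Stand-in for captured production data: inputs paired with legacy outputs.
sample_inputs = [[], [1, 2, 3], [7], [-3, 2]]
recorded_cases = [(case, legacy_average(case)) for case in sample_inputs]

# An empty list here means the rewrite is a drop-in replacement, bugs and all.
mismatches = find_mismatches(refactored_average, recorded_cases)
```

Once `mismatches` comes back empty, each later bug fix shows up as a small, explainable set of differences against the recorded outputs, instead of being buried in a wholesale change.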
Good luck, and may your management be pragmatic.