This answer isn't quite what you asked for, but it might be useful to hear.
It sounds like your legacy system has accumulated a lot of special use cases over the years, each of which received custom features to handle it. While the code might be a technical-debt nightmare, the business logic it encodes is probably quite valuable, and replacing it with a completely new system that would need to be tailored for all those use cases might actually be worse than paying off the technical debt.
Here's how you can pay off the technical debt:
Your first main goal should be to move all the logic as-is (as much as possible) into a module with a sane API. Broadly: the per-client configuration becomes the attributes of the object; there needs to be a data store that keeps a history of the notifications sent to each client; and the main method of the module is the one that handles new events. If this code is as bad as it sounds, it probably sends the emails directly. You'll need to factor that out so that it "emits" email data in some form that can be passed to a separate mailer object. One final "hidden" parameter is the current value of 'time', which you should factor out as well.
In a procedural sense, you would:
- Create a new package MyApp::Notify (or any more specific name)
- Copy all the event code into a "handle_event" method of this new module
- Go line by line through "handle_event":
  - If the code references a configuration variable, create an attribute on the object for it, and document what it is and who needs to set it.
  - If the code sends a message, change it to return a value of some sort describing what needs to be sent.
  - If the code looks up (in the database or elsewhere) the last message sent for this event, convert that into a more generic "history lookup" method on the object.
  - If the code references the current system time, change it to use a 'time' parameter passed to the handle_event method, or a time attribute on the event object.
  - Avoid any other refactoring in the event logic. Focus on this one change first before getting carried away with other code cleanup.
- Find every place that emits events, and have it send that event to this object, created on demand for the current customer. The best way to do that depends on what framework you're using.
- Write the adapter that takes the return value of handle_event and generates the actual notification (email, SMS, etc.)
- Do one last painful manual test of all of this, and deploy it. Fight through the bugs you introduced for the next month or so.
- Finally, begin writing tests for your event module. Create lots of examples of (config, example history, event, expected output) and verify that handle_event produces the correct return value for each event under each set of circumstances. Remember the part above where you converted history lookups into a method on the MyApp::Notify object? That's so you can now mock that method to return whatever history you want to test against.
- Write more tests
- Write even more tests
- Begin refactoring the code to be less spaghetti. Use your test suite to validate that you didn't break anything.
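The shape those steps converge on can be sketched roughly like this. The answer's names suggest Perl (MyApp::Notify), but here is an illustrative Python sketch; the config field, event shape, and the 24-hour "don't repeat yourself" rule are all made-up examples, not anything from the original code:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Notifier:
    """Stand-in for MyApp::Notify: all per-client config lives here."""
    client_id: int
    digest_mode: bool = False  # hypothetical per-client setting

    def lookup_history(self, event_type):
        """Timestamps of past notifications of this type for this client.
        Production code queries the database; tests override this method."""
        raise NotImplementedError

    def handle_event(self, event, now):
        """Pure decision logic: no mail is sent here. Returns a list of
        message descriptions for a separate mailer/adapter to deliver.
        'now' is a parameter so tests can control the clock."""
        history = self.lookup_history(event["type"])
        # Made-up rule: don't repeat the same notice within 24 hours.
        if history and now - history[-1] < timedelta(hours=24):
            return []
        return [{
            "client": self.client_id,
            "template": event["type"],
            "params": event,
        }]
```

Because lookup_history is a method and the clock is a parameter, a test can pin both down with no database and no fake SMTP server, and simply assert on the returned message list.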
Your situation is probably more complicated than that, of course, but that's a general pattern for refactoring a monolith of nasty code, and it has worked well for me.
Another pro-tip you might find useful: use a spool table in your database for outgoing messages. Instead of emitting a message immediately, the code writes a message (or the parameters to generate one from a template) into a database table with various useful columns: when the message should be delivered, from, to, what caused it to be queued, and so on. Lots of benefits here:
- You can commit the message as part of a database transaction, so either the data is committed and a notification is queued to be sent, or the transaction fails and no notification gets sent.
- You can queue messages to be sent at a later date, so you can release a large batch of messages all at once. If the process that generates the messages fails halfway through, you have a time window to delete the queue for that process and start over. Saves sending embarrassing apologies to customers.
- You can generate the data for an email template independently of the template itself. This gives you a chance to correct a template before the messages get mailed, without having to re-generate them.
- If some rows of the spool table fail to get delivered, you can have subsequent processes analyze and deal with that instead of trying to trap mail-sending errors in the middle of some other application logic. (Definitely add a monitor to alert you about overdue spool table rows.)
- You can write tests that verify some application logic generates a row in the spool table, instead of needing to test against a fake SMTP server.
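Here's one way the spool table might look, sketched with SQLite for illustration. The table name, columns, and the "orders" example are assumptions; a real system would add indexes, a delivery worker, retry bookkeeping, and so on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("""
    CREATE TABLE message_spool (
        id            INTEGER PRIMARY KEY,
        recipient     TEXT NOT NULL,
        template      TEXT NOT NULL,
        queued_by     TEXT,                -- what caused it to be queued
        deliver_after TIMESTAMP NOT NULL,  -- hold until this time
        sent_at       TIMESTAMP            -- NULL until actually delivered
    )
""")

def place_order(conn, customer, deliver_after):
    """The business change and its notification commit atomically:
    afterwards either both rows exist or neither does."""
    with conn:  # one transaction; commits on success, rolls back on error
        conn.execute("INSERT INTO orders (customer) VALUES (?)", (customer,))
        conn.execute(
            "INSERT INTO message_spool (recipient, template, queued_by, deliver_after)"
            " VALUES (?, 'order_confirmation', 'place_order', ?)",
            (customer, deliver_after),
        )

def overdue(conn, now, grace_minutes=30):
    """Unsent rows well past their delivery time -- feed these to a monitor."""
    return conn.execute(
        "SELECT id, recipient FROM message_spool"
        " WHERE sent_at IS NULL AND deliver_after < datetime(?, ?)",
        (now, f"-{grace_minutes} minutes"),
    ).fetchall()
```

A test for the application logic then just asserts that the right row landed in message_spool, and the overdue query is what the monitoring alert would poll.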
Hope that helps.