Re-Factoring: Philosophy and Approach

George_Sherston has asked for the wisdom of the Perl Monks concerning the following question:

Alright, it had to come to this in the end, but it was dragonchild's challenging note that stimulated me to raise this question.

I'm about a third of the way through what seems to me to be a huge script, and I've come to a natural point at which I can stop and take stock. Here is how it works: users get emails when they have to do something. Each email contains a link to the script with a query string of CGI parameters. When users click on their links they get their own personal query page, in which they enter the information they're being asked for. The result is that I've got two thousand lines of code that serve up and process a series of mini-forms, each unique, but all having lots of things in common, both as to form and content.

I knew from the outset that I was going to want to re-factor this, but I thought I'd write a big gobbet of it so that I had something to work with... and that's where I am now. Before I write the remaining two thirds I want to take what I have and split it up into intelligent re-usable subroutines and tidy them away into modules. Then the remaining 2/3 should be a bit quicker to write.

So far, what I've done is page down through the script, keeping an eye out for repeating patterns (and remembering where I had put some things that looked the same as other things). I found one thing I was doing over and over, and I put it in a module. But that doesn't seem like a very scientific way to go about it.

Perhaps the scientific way to go about it would have been to conceive the whole thing in modular form at the outset - but that's beyond me at my current level of experience. I hope when I've finished this project I'll have enough of a feel for how this stuff works that I could do that for the next one - but for now I decided the only way I could do it was by writing a bit of it out and then condensing that.

So I'm appealing to wise monks for your views on how to approach this task, in the most general terms.

In the inevitable compromises involved in reducing several different activities into a uniform format, what do you trade off for what?
Where do you start in deciding what goes in which subroutine?
Do you use one module or several?
What's the optimum size for a subroutine?
If you've got one thing you do over and over again in two slightly different ways, do you want one subroutine with an if or two subroutines?
And anything else?
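
To make the "one subroutine with an if, or two subroutines" question concrete, here is a made-up sketch of the two shapes I have in mind. It is not taken from my script: every name in it (form_wrapper, field_html, text_field, textarea_field, build_form) is hypothetical, and it skips HTML escaping and CGI parameter handling entirely.

```perl
use strict;
use warnings;

# Hypothetical helper: the markup every mini-form has in common.
sub form_wrapper {
    my ($title, $body) = @_;
    return qq{<h1>$title</h1>\n<form method="post">\n$body\n</form>\n};
}

# Option 1: one subroutine, with an if choosing between the two variants.
sub field_html {
    my ($name, $type) = @_;
    if ($type eq 'textarea') {
        return qq{<textarea name="$name"></textarea>};
    }
    return qq{<input type="text" name="$name">};
}

# Option 2: two small subroutines, picked through a dispatch table.
sub text_field     { my ($name) = @_; return qq{<input type="text" name="$name">}; }
sub textarea_field { my ($name) = @_; return qq{<textarea name="$name"></textarea>}; }

my %field_for = (
    text     => \&text_field,
    textarea => \&textarea_field,
);

# Build one mini-form from a title and a list of [ name, type ] pairs.
sub build_form {
    my ($title, @fields) = @_;
    my $body = join "\n", map { $field_for{ $_->[1] }->( $_->[0] ) } @fields;
    return form_wrapper($title, $body);
}

print build_form('Delivery date', [ date => 'text' ], [ notes => 'textarea' ]);
```

With only two variants the if is shorter; my guess is that the dispatch table starts to pay off once there are more field types than fit comfortably in one if/elsif chain, but that is exactly the kind of trade-off I'd like opinions on.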

I've put the WHOLE of my existing script here so you can see what I'm up against. Of course, I'd love to have comments on individual bits of the code - but my main interest is in getting an overall approach to boiling it down so that (A) it works efficiently and (B) it's easy to expand / edit.

§ George Sherston
