in reply to Style Question: small and modular, or big enough to do the job in one piece?

rah,

It really depends. Asking if 200 lines is too much is kinda like asking if 200 coins is too much: do you have 200 pennies, 200 quarters, 200 gold pieces, or a nice mixture? The line count by itself tells you almost nothing. When you're looking at those two hundred lines, ask yourself these questions:

How many of them are exact copies?
How many of them are similar copies?

What you want to do here is figure out how much "copy-n-paste" re-use you have - those are prime candidates for refactoring into subroutines.
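To make that concrete, here's a minimal sketch of folding a pasted construct into a subroutine. The sub name clean_lines and the sample data are made up for illustration; they are not from rah's script.

```perl
use strict;
use warnings;

# Hypothetical helper: drops comment lines and blank lines from a list
# of control-file lines. Before refactoring, imagine this chomp/grep
# pair pasted once for every list the script processes.
sub clean_lines {
    my @lines = @_;
    chomp @lines;
    return grep { /\S/ && !/^\s*#/ } @lines;
}

# Two former copy-n-paste sites now share one definition.
my @jobs  = clean_lines("# nightly\n", "backup\n", "\n", "rotate_logs\n");
my @hosts = clean_lines("alpha\n", "  \n", "# retired: beta\n", "gamma\n");

print "jobs:  @jobs\n";
print "hosts: @hosts\n";
```

The "similar copies" usually differ in one or two values; those values become the subroutine's arguments.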

Once you have the "copy-n-paste" out of the way, take a look at what's left and ask yourself these questions about each subroutine in turn (and don't forget the implicit main subroutine):

Is this beastie cohesive - does it do one thing very well, or is it all over the place?
Is this beastie coupled - does it depend on global data, or does it poke at other subroutines' data?

Shoot for high cohesiveness and low coupling (the holy grail of programming). The great thing about looking at cohesiveness and coupling at the subroutine level is that it can really help you refactor code toward OO: hmmm, these five subroutines all work on this one data structure while these three work on another - eureka ... module discovery.
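Here's a rough sketch of what that "module discovery" can look like once a few subroutines turn out to share one data structure. The package name JobQueue and its methods are hypothetical, just to show the shape: the data lives inside the object (no globals, so low coupling) and every method does one thing to it (high cohesiveness).

```perl
use strict;
use warnings;

package JobQueue;    # hypothetical module name

# All queue state lives in the object, not in file-scoped globals.
sub new      { my ($class) = @_; return bless { jobs => [] }, $class }
sub add      { my ($self, $job) = @_; push @{ $self->{jobs} }, $job; return $self }
sub next_job { my ($self) = @_; return shift @{ $self->{jobs} } }
sub pending  { my ($self) = @_; return scalar @{ $self->{jobs} } }

package main;

my $queue = JobQueue->new;
$queue->add('backup')->add('rotate_logs');

print $queue->next_job, "\n";    # prints "backup"
print $queue->pending,  "\n";    # prints "1"
```

In a real script the package would normally move into its own .pm file; it's kept in one file here so the sketch runs as-is.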

Now once you've done all that, your 200 lines of code may now be 300 lines of code - is that any worse? Well, from a run-time perspective - probably. Calling subroutines, loading modules and calling methods all cost you. But that cost is hopefully down in the noise, and what you really gain is ease of maintenance and ease of extension when new batch jobs come along.
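If you'd rather measure that cost than guess at it, the core Benchmark module can compare the two styles. This is only a sketch: the add sub and the toy loop are invented for illustration, and the numbers will vary from machine to machine, so none are quoted here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical one-line helper standing in for a refactored subroutine.
sub add { my ($x, $y) = @_; return $x + $y }

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    inline     => sub { my $sum = 0; $sum = $sum + $_     for 1 .. 1000; $sum },
    subroutine => sub { my $sum = 0; $sum = add($sum, $_) for 1 .. 1000; $sum },
});
```

A toy loop like this exaggerates the overhead, since the sub does almost no work per call; in a batch job that touches files or a network the difference is much harder to see.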

-derby

Always focus on maintainability...
by dragonchild (Archbishop) on Feb 14, 2002 at 15:41 UTC
    ++ Brother derby! This is exactly the kind of analysis that needs to be done.

    The most important thing in the vast majority of coding projects is maintainability, or "How easily can someone who's never seen this code before go in and make a change with a reasonable certainty that nothing else was adversely affected?" The answer to that question determines the code's maintainability.

    Run-time analysis is, in my mind, overused. The vast majority of coding projects do not have a hard run-time constraint. Yeah, we all want our code to finish NOW, but if it takes 2 seconds instead of 1.8 seconds, that's OK. If adding 0.2 seconds to its runtime allows a programmer to safely make a change in 5 minutes instead of 2 hours, that's worth it.

    Memory constraints should be treated the same way. Most machines have far more hardware than a reasonable user will ever put to work. Do the right thing for yourself. It's only a machine.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means: worry only about what you need to implement.

      Maintainability is exactly the issue here. My architect friend says to break it up so each piece is small and easily digestible w.r.t. function, even if this means calling multiple scripts to do the "job". My view is to have it all in one place so I don't have to chase around multiple files/locations to figure out what the job is doing.

      I don't think the two are completely at odds. Even though my script ended up longer than I expected, I:
      - used subroutines to avoid repeating the same constructs
      - commented liberally, since I and my admins are all relatively new to Perl
      - wrote a pretty comprehensive perldoc that describes what the control file looks like and how to call the script with the control file and the available args.