Ralesk has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

I have a module that I’d like to split into smaller parts. The organisation of the software currently is (using generic names): frontend.cgi (an FCGI app actually, could be anything else), AcmeAPI.pm (the module I want to split), and a few other pms unrelated to this module but used by the front end. The modules are not OO, there’s no need for objects right now.

So, frontend.cgi is basically just a router, it works with the received data, calls AcmeAPI’s functions as appropriate, and returns the results in the requested format. AcmeAPI has many facets and a few utility functions common to these facets. As I expect these functions to grow, I want to split things up into AcmeAPI::This, AcmeAPI::That etc. — that part is simple, just throw the appropriately named pm files in the appropriate subdir, and all is done.
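
For concreteness, the mechanical part of that split might look like the sketch below (AcmeAPI::This and list_widgets are invented names; the real functions would move over wholesale): each facet becomes its own package under the AcmeAPI:: namespace and exports plain functions via Exporter, so nothing needs to become OO.

    # AcmeAPI/This.pm -- one facet of the API, plain functions, no objects
    package AcmeAPI::This;

    use strict;
    use warnings;
    use Exporter 'import';

    our @EXPORT_OK = qw(list_widgets);

    # Hypothetical facet function; the real AcmeAPI code would live here.
    sub list_widgets {
        my (%args) = @_;
        return { widgets => [], filter => $args{filter} };
    }

    1;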

My problems are the following:

Re: Package/module organisation question
by sundialsvc4 (Abbot) on Jun 13, 2012 at 17:42 UTC

    Let me contribute just a few more thoughts about your AcmeAPI.pm. It is not quite clear from your description whether this is indeed an API, say to an external .so or .dll library, or just “a stinkin’ hunk-o’ legacy code.” (Don’t be alarmed... we all have one.) Unless you want one big fat chunk of code inserted monolithically into not only your CGI programs but everything else as well, you are going to have to develop a strategy for splitting it up, CGI or no.

    Most of the time, when tasked to do this, I first of all decided to leave the original file name alone, because I could not know how many other routines might refer to it. But I transformed it into a “stub” module that used the component chunks into which I broke it. This preserved source-code compatibility.

    I then broke the source code out into reasonable subdivisions, building a module structure. Initially I had each of these use the original name (now a stub), so that the end result, at least at first, was that “everything got sucked in at once.” Later on down the road, I could start replacing those with more specific use statements.
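
    In code terms, the stub might look like the hedged sketch below: AcmeAPI.pm keeps its name but only pulls in the new pieces and re-exports their functions (AcmeAPI::This, AcmeAPI::That and the function names are invented), so existing callers keep compiling unchanged.

        # AcmeAPI.pm -- compatibility stub: load the new pieces and
        # re-export their functions so existing callers work unchanged.
        package AcmeAPI;

        use strict;
        use warnings;
        use Exporter 'import';

        # Importing aliases each function into package AcmeAPI, so
        # fully-qualified calls like AcmeAPI::list_widgets() still resolve.
        use AcmeAPI::This qw(list_widgets);
        use AcmeAPI::That qw(save_widget);

        # Offer the same names callers used to import from AcmeAPI itself.
        our @EXPORT_OK = qw(list_widgets save_widget);

        1;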

    It is extremely important to use source-code version control! (I use git, since it requires no server and integrates directly into the Eclipse editor that I also use.) First check in the source code exactly as it is, then make a dummy branch (which will never change) representing it, and then a separate branch into which you check in your revisions as you systematically make them. Check in every significant change as soon as you make it, and keep detailed notes right there in the log. That way, every change that you (or anyone else on your team) makes is both documented (in substance and in intention) and reversible at will. (Yep, nothing beats a good wayback machine...)

      It’s thankfully Perl-only, and mostly a sane interface for doing database queries. It could happen that not just the web front end but other, related applications will use its functions as well, though right now it’s a web-only thing. It speaks both JSON and XML (oh, the pain…), the front end is mostly RESTful and clean, and the reason we haven’t jumped on the PSGI boat yet is simply lack of familiarity with the new stuff plus lack of time. I do have it in mind to fork it and do it the Plack way (or maybe even use Dancer to give me the routing-fu), but for the amount of (internal) traffic this is going to receive, our usual Apache + FCGI combo is more than enough.

      Yeah, legacy stuff. It’s hard to switch away from things you already have configured far better than you could configure anything new in the next half a year…

      So, back to AcmeAPI.pm. I had a rather short time frame to work on it, and when I’m in such a situation I tend to keep things in as few files as possible. I wrote extensive tests to make sure things happen the way they are supposed to, but kept file organisation to a minimum. So it’s one monolith, plus another module that does (de)serialisation for XML and JSON, to and from the same structure the API functions expect.
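
      A minimal sketch of what such a format layer can look like, assuming JSON::PP (core) and XML::Simple from CPAN (the package and function names are invented): one function to thaw a request body into the structure the API functions expect, and one to freeze the result back out.

          # AcmeAPI/Serialise.pm -- toy version of the format layer
          package AcmeAPI::Serialise;

          use strict;
          use warnings;
          use JSON::PP ();
          use XML::Simple qw(XMLin XMLout);

          # Request body -> the plain structure the API functions expect.
          sub thaw {
              my ($format, $body) = @_;
              return $format eq 'json'
                  ? JSON::PP->new->utf8->decode($body)
                  : XMLin($body);
          }

          # API result -> response body in the requested format.
          sub freeze {
              my ($format, $data) = @_;
              return $format eq 'json'
                  ? JSON::PP->new->utf8->canonical->encode($data)
                  : XMLout($data);
          }

          1;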

      RPC::Any, which you mentioned in the other post — nice catch: it is an RPC more than an API. Weird naming.

      So, as the pm itself is not legacy (just some of the technologies involved are), and so far it is only used by the frontend CGI, I can freely do anything to the backend as long as I keep the frontend’s interface and its functionality the same (the same URL with the same data doing the same things).

      Also, we do use VCS: we switched from SVN to Mercurial a good while ago, and we definitely use it all the time. No worries there.

Re: Package/module organisation question
by sundialsvc4 (Abbot) on Jun 13, 2012 at 12:52 UTC

    The approach taken by, e.g., RPC::Any might be useful to the overall organization of an application like this one; ditto UNIVERSAL::require. It is often useful for the various components of a large application to be “demand-loaded” into memory only when needed, and these packages illustrate working implementations of this... and RPC::Any also gives you a simple way to distinguish between what is externally callable (by a web request, say) and what is not. I have used both modules personally with great success.
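
    A hedged sketch of the demand-loading idea with UNIVERSAL::require (the handler naming scheme and handle() entry point are invented): the dispatcher loads a handler class only the first time its route is hit, and dies with the module’s reported error if the load fails.

        use strict;
        use warnings;
        use UNIVERSAL::require;

        sub dispatch {
            my ($route) = @_;                     # e.g. 'widgets', assumed trusted
            my $class = "AcmeAPI::Handler::\u$route";
            $class->require                       # no-op if already loaded
                or die "cannot load $class: $UNIVERSAL::require::ERROR";
            return $class->handle;                # handle() is an assumed entry point
        }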

    The use directive incorporates, at compile time, any packages that a given package needs “all the time.” Anything that a demand-loaded package itself uses will, if not in memory already, be incorporated at that point, and only once: if it is already present, nothing else happens.
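
    That “once” is the %INC bookkeeping: both use and require record loaded files there, so a second load of the same module is a cheap no-op, as this small snippet shows.

        use strict;
        use warnings;

        require List::Util;    # compiled and recorded in %INC now
        require List::Util;    # already in %INC: nothing happens

        print "loaded from $INC{'List/Util.pm'}\n";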

    In my experience, most such applications naturally divide themselves (besides the core...) into several concerns, such as “request handlers,” “objects,” and “common utilities.” Of these three, the request handlers are what usually gets dynamically loaded; everything else simply takes care of itself. I usually deploy with FastCGI (Plack), and arrange for the worker processes to “commit harakiri” (this actually is the term that Plack uses...) after a few thousand requests have been served, so that their memory is periodically reclaimed by the operating system. I highly recommend that you take a very close look at Plack.
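
    For reference, the PSGI side of that looks roughly like this sketch (the request threshold is an invented number): when the handler advertises psgix.harakiri support, the app sets psgix.harakiri.commit once it has served enough requests, and the worker exits so the OS reclaims its memory.

        # app.psgi -- worker retires itself after MAX_REQUESTS responses
        use strict;
        use warnings;
        use constant MAX_REQUESTS => 5000;

        my $served = 0;

        my $app = sub {
            my $env = shift;
            if (++$served >= MAX_REQUESTS && $env->{'psgix.harakiri'}) {
                $env->{'psgix.harakiri.commit'} = 1;   # exit after this response
            }
            return [200, ['Content-Type' => 'text/plain'], ["OK\n"]];
        };

        $app;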