in reply to Structuring code

It looks like you have one registry that maps from types to dirs and texts, and to urls and regexes. Then you have different functions that operate on particular subkeys for particular types. ("Type" is a name you are using; in more abstract notation it is just another key that happens to be top-level.)

This suggests a few things.

  1. There is one registry. Make it global.
  2. You have business-logic functions that do particular things on one type at a time. So these functions should get the type (probably by name -- that's easier), and know about the registry. Since we are making the registry global (or, if this is a package, at least a file-scoped lexical that every function in the package can see), you don't need to explicitly pass a reference to the registry.
  3. You need generic functions for manipulating the registry too. You'd mentioned adding a new type; maybe that's all you need right now but this approach will also let you cleanly write code to delete a type, clone it, and so on. Implement these as you need them, but the trick is--
  4. What is referred to by a type always has the same structure. This suggests a class (though you don't have to go OO all the way if you don't want to). You then write a "loader" function to translate from data to a series of objects that will be less bulky (visually, at least) than writing out the code explicitly, inline. The big win here is that you can also pull out the data from the code completely and put it in a configuration file (or a database, or whatever). When you update the structure of a type, you have to update this loader, but now you have an encapsulated object which everybody accesses through a well-defined interface, so they don't have to change their code.
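
A minimal sketch of points 1 through 4, to show the shape rather than a finished design. The names %registry, add_type, delete_type, load_registry and dir_for, and the sample "books" entry, are all made up for illustration; nothing here comes from your actual code.

```perl
use strict;
use warnings;

# Point 1: one registry, visible to every function in this file.
my %registry;

# Hypothetical data. This is the part that could later move out of
# the code and into a config file or a database.
my @raw = (
    {
        type  => 'books',
        dir   => '/data/books',
        text  => 'Book list',
        url   => 'http://example.com/books',
        regex => qr/book/i,
    },
);

# Point 3: generic functions for manipulating the registry.
sub add_type {
    my (%args) = @_;
    my $type = delete $args{type};
    die "type '$type' already registered\n" if exists $registry{$type};
    $registry{$type} = { %args };
    return $type;
}

sub delete_type {
    my ($type) = @_;
    return delete $registry{$type};
}

# Point 4: the loader translates plain data into registry entries.
sub load_registry {
    add_type(%$_) for @_;
    return;
}

# Point 2: a business-logic function gets the type by name and
# finds the registry on its own.
sub dir_for {
    my ($type) = @_;
    my $entry = $registry{$type} or die "unknown type '$type'\n";
    return $entry->{dir};
}

load_registry(@raw);
print dir_for('books'), "\n";
```

If the structure of an entry changes, only the loader and the accessor functions need to follow; callers of dir_for() stay as they are.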

Hope this helps!

Re^2: Structuring code
by bobf (Monsignor) on Jan 06, 2005 at 09:37 UTC

    Thanks for your comments, gaal. It sounds like you're leaning towards one of my first ideas, "pulling all of the %data hashes out and combining them [into a global hash], then passing a hashref ($data{$type})", even if it is implemented with an OO approach.

    1. There is one registry. Make it global.
    This was my first inclination, but I was torn between that and keeping the data elements localized to the sub they are used in. I thought it would be easier to understand the code if the data was right there next to it, but maybe I'm better off keeping all the data together.

    2. ...these functions should get the type (probably by name -- that's easier), and know about the registry.
    Currently, the names of the "types" are the top-level keys to all the hashes, and that is what I pass as a parameter. If I combined all the data hashes per comment #1, I could pass $data{$type} instead. This would limit the portion of %data that is accessible by the functions to only the entries under the specified $type key, and it would eliminate the need to pass either $type or a ref to %data (although the latter would not be necessary if it were a global, as you point out).
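
    A minimal sketch of that idea. The keys, the values and the build_path function are placeholders invented for the example, not the real data:

    ```perl
    use strict;
    use warnings;

    # Combined hash; the entries here are placeholders.
    my %data = (
        books => { dir => '/data/books', text => 'Book list' },
        music => { dir => '/data/music', text => 'Music list' },
    );

    # The function is handed one entry, so it never needs $type
    # or a reference to all of %data.
    sub build_path {
        my ($entry, $file) = @_;
        return "$entry->{dir}/$file";
    }

    my $type = 'books';
    print build_path($data{$type}, 'index.txt'), "\n";
    ```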

    3. You need generic functions for manipulating the registry too. You'd mentioned adding a new type;
    I wasn't very clear about this in the OP - sorry. The different "types" (top-level keys) actually represent different data files, and the functions are used to manipulate and search through them. It is possible that new data files could be added to the list, but that would be a very rare occurrence and I assumed it would be easier just to add some entries to the %data hash(es) rather than develop a set of functions to do it for me. You raise a good point, though - I'll give that some thought.

    4. This suggests a class (though you don't have to go OO all the way if you don't want to).
    I started thinking about using objects (and their advantages), but I'm still very new at OO programming and I haven't totally thought out how I could/would/should structure things. Nonetheless, I do like the idea of having the guts encapsulated behind a defined interface. I think I'll have to spend more time considering this approach.

    Thanks again for the comments. It's nice to have a few different ideas to consider.

      If you go the OO way, you (usually) *wouldn't* write your functions to get a type by name. Rather, your registry would become a conceptually very simple map from names to objects; if the calling code already has a reference to the object it is working on, it just invokes what is now a proper method on that object. The registry gives you: (a) a convenient way to get the object if you only have its name, and (b) a reference to the object that lives as long as the program does.
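
      Roughly like this. The class name DataFile and its dir method are hypothetical, picked only to have something to call:

      ```perl
      use strict;
      use warnings;

      package DataFile;    # hypothetical class name

      sub new {
          my ($class, %args) = @_;
          return bless { %args }, $class;
      }

      # What used to be a function taking a type name is now a method.
      sub dir {
          my ($self) = @_;
          return $self->{dir};
      }

      package main;

      # The registry is nothing more than name => object.
      my %registry = (
          books => DataFile->new(dir => '/data/books'),
      );

      # (a) look the object up when all you have is its name ...
      my $file = $registry{books};

      # ... and from then on call methods on the object directly.
      print $file->dir, "\n";
      ```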

      If separating different parts of each object (/ different aspects of each type) looks like a concern to you, then perhaps you need to deepen your object model (that's what you do in OO in these cases). So you say each "type" is really a disk file? Then you want a relatively thin File class, that is composed of further objects: FileLocation (directory, text, what you had in your first function), and FileURL (the things you had in your second function). I don't really know if my names make sense: you're the one who knows the business logic -- why you have regexps and urls treated together, for example.
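
      A sketch of that composition, assuming the three class names above. The accessor names and the sample values are guesses made for the example:

      ```perl
      use strict;
      use warnings;

      package FileLocation;
      sub new       { my ($class, %args) = @_; return bless { %args }, $class }
      sub directory { return $_[0]{directory} }
      sub text      { return $_[0]{text} }

      package FileURL;
      sub new   { my ($class, %args) = @_; return bless { %args }, $class }
      sub url   { return $_[0]{url} }
      sub regex { return $_[0]{regex} }

      package File;    # thin: it only holds the two parts

      sub new {
          my ($class, %args) = @_;
          return bless {
              location => FileLocation->new(%{ $args{location} }),
              url      => FileURL->new(%{ $args{url} }),
          }, $class;
      }
      sub location { return $_[0]{location} }
      sub url      { return $_[0]{url} }

      package main;

      my $books = File->new(
          location => { directory => '/data/books', text => 'Book list' },
          url      => { url => 'http://example.com/books', regex => qr/book/i },
      );

      print $books->location->directory, "\n";
      print $books->url->url, "\n";
      ```

      Code that only cares about directories talks to $books->location and never sees the URL side, and vice versa.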