jeorgen has asked for the wisdom of the Perl Monks concerning the following question:

Dear fellow monks,
I have a question for you about object-oriented design in Perl.

I'm coding a log analyser. It works, but I wonder if I have made the correct object-oriented design decisions.

Maybe this question would fit better as a meditation, but since I'm right in the middle of coding it is actually of practical importance to me.

I'll start with two concrete examples and then follow with the more general question.

There is a class called "Category" that is instantiated into objects. Each object holds a pattern to match, a title for the category, a data structure for the category's statistics, methods that print a form interface for using the category in the first place, and methods that print the statistics. This makes for a self-contained object that holds the data for the category and the methods for handling it in all respects.

Now, when the object does its magic it starts from the high level:

A method "process event" is called first. It in turn calls the object's "match_line" method and then its "store_event" method. These two methods in turn trigger other, lower-level methods in the object.
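
A minimal sketch of that call chain, for readers who want something concrete. Only the method names match_line and store_event come from the description above; the constructor arguments, the stats layout, and the count_for accessor are assumptions made for illustration, not the actual analyser code.

```perl
use strict;
use warnings;

package Category;

sub new {
    my ($class, %args) = @_;
    my $self = {
        title   => $args{title},
        pattern => $args{pattern},   # a compiled regex such as qr/\.gif\b/
        stats   => {},               # hit counts keyed by the matched text
    };
    return bless $self, $class;
}

# High level: the one method a caller needs.
sub process_event {
    my ($self, $line) = @_;
    my $match = $self->match_line($line);
    return 0 unless defined $match;
    $self->store_event($match);
    return 1;
}

# Lower level: return the matched text, or undef when the line does not match.
sub match_line {
    my ($self, $line) = @_;
    my ($hit) = $line =~ /($self->{pattern})/;
    return $hit;
}

# Lower level: update the statistics for one matched event.
sub store_event {
    my ($self, $match) = @_;
    $self->{stats}{$match}++;
    return;
}

# Hypothetical accessor, added so the sketch has something to report.
sub count_for {
    my ($self, $key) = @_;
    return $self->{stats}{$key} || 0;
}

package main;

my $images = Category->new(title => 'Images', pattern => qr/\.(?:gif|png)\b/);
$images->process_event($_) for (
    'GET /logo.gif HTTP/1.0',
    'GET /index.html HTTP/1.0',
    'GET /photo.png HTTP/1.0',
    'GET /banner.gif HTTP/1.0',
);
printf "%s: .gif=%d .png=%d\n", $images->{title},
    $images->count_for('.gif'), $images->count_for('.png');
```

Callers only ever touch process_event; the two lower-level methods could later move into a helper object without changing that call.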

If I look at it from a distance, the methods seem to fall into higher and lower levels.

In the case of a class for object persistence that I wrote some time ago, there seemed to be three levels: a high level that works as a kind of interface ("$object->save"), an intermediate level that acts like a switchboard for choosing implementation methods, and an implementation level that actually does something (e.g. save to flat file system or RDBMS with certain attributes).
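
The three levels might look roughly like this when they all live in one class. This is a sketch under assumptions: apart from save, every name here (Persistent, the backend argument, _choose_save_method, the in-memory %STORAGE stand-in) is hypothetical, and the "implementations" only copy data into a hash so the example runs anywhere.

```perl
use strict;
use warnings;

package Persistent;

# Stand-in for real storage (an assumption for this sketch; a real class
# would write to disk or issue SQL here).
my %STORAGE = (file => {}, rdbms => {});

sub new {
    my ($class, %args) = @_;
    return bless {
        id      => $args{id},
        data    => $args{data} || {},
        backend => $args{backend} || 'file',
    }, $class;
}

# Implementation level: methods that actually do something.
sub _save_to_file {
    my ($self) = @_;
    $STORAGE{file}{ $self->{id} } = { %{ $self->{data} } };
    return 'file';
}

sub _save_to_rdbms {
    my ($self) = @_;
    $STORAGE{rdbms}{ $self->{id} } = { %{ $self->{data} } };
    return 'rdbms';
}

# Intermediate level: the switchboard that picks an implementation.
my %SAVE_METHOD = (
    file  => \&_save_to_file,
    rdbms => \&_save_to_rdbms,
);

sub _choose_save_method {
    my ($self) = @_;
    my $method = $SAVE_METHOD{ $self->{backend} }
        or die "Unknown backend '$self->{backend}'\n";
    return $method;
}

# High level: the interface callers see.
sub save {
    my ($self) = @_;
    my $method = $self->_choose_save_method;
    return $self->$method();
}

# Demonstration helper only: look at what was "stored".
sub stored_in {
    my ($class, $backend, $id) = @_;
    return $STORAGE{$backend}{$id};
}

package main;

my $object = Persistent->new(id => 42, data => { title => 'Images' }, backend => 'rdbms');
print "saved via: ", $object->save, "\n";
```

Adding a back end here means touching the class in two places (a new method and a new switchboard entry), which is the kind of coupling the question is about.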

I know that in the world of object orientation, there are all kinds of concepts and patterns swirling around: adapters, delegation, inner classes, etcetera. So my question is: What is a good solution to this problem with methods on higher and lower levels? Is the design with all levels in the same class fundamentally wrong? How would you do it?

The goals I want to achieve are the following:

Thanks in advance,

/jeorgen

Replies are listed 'Best First'.
(chromatic) RE: High-level methods to low-level: Do I put them together?
by chromatic (Archbishop) on Jul 15, 2000 at 06:23 UTC
    I've done a bit of thinking on that very issue for Jellybean, as a matter of fact. Our design will probably end up using interface-type code and helper objects.

    Take the DBI and DBD architecture, for example. If you want to store your data in a database, write to the DBI interface, and let people use whichever DBD matches their database of preference. I say, write your high-level objects generically enough that they can use any helper objects with the right interface. (I guess that means, make polymorphic helper objects.)

    That means that you can just flip the switch of changing which specific helper object to use, rather than recoding bits and pieces of your main object. If you're good, you can even pass a reference to your preferred helper to the main object constructor.
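
A small sketch of the helper-object idea, assuming hypothetical class names (TextRenderer, HtmlRenderer, CategoryReport) and a hypothetical shared method render. The only point it tries to show is that the main object accepts any helper with the right interface through its constructor.

```perl
use strict;
use warnings;

# Two hypothetical helper classes sharing one interface: render($title, $count).
package TextRenderer;
sub new    { my ($class) = @_; return bless {}, $class }
sub render { my ($self, $title, $count) = @_; return "${title}: $count" }

package HtmlRenderer;
sub new    { my ($class) = @_; return bless {}, $class }
sub render { my ($self, $title, $count) = @_; return "<li>${title}: $count</li>" }

# The main object only knows that its helper can render().
package CategoryReport;

sub new {
    my ($class, %args) = @_;
    return bless {
        title    => $args{title},
        count    => 0,
        renderer => $args{renderer} || TextRenderer->new,   # default helper
    }, $class;
}

sub add_hit {
    my ($self) = @_;
    $self->{count}++;
    return $self;
}

sub as_string {
    my ($self) = @_;
    return $self->{renderer}->render($self->{title}, $self->{count});
}

package main;

my $plain = CategoryReport->new(title => 'Images');
my $html  = CategoryReport->new(title => 'Images', renderer => HtmlRenderer->new);
$_->add_hit->add_hit for $plain, $html;

print $plain->as_string, "\n";
print $html->as_string, "\n";
```

Switching output formats is then a one-argument change at construction time; CategoryReport itself is not recoded.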

      Thanks, chromatic, for your reply. I have gone through your answer a couple of times and had the opportunity to discuss it with my colleague Uno, who is a Java programmer (the two of us together is pretty much a company meeting :-). I'm starting to grok what he's saying now. The DBI/DBD example you gave is also illuminating.

      I am a systems analyst by training, but I trained just before they started teaching object-oriented analysis and design, so I missed out on that part. One thing I did learn, though, is that it's good if an application you make can cope with change.

      So now over to the question at hand: High-level methods to low-level: Do I put them together?

      • It seems OK to mix levels of methods in a class, as long as you don't foresee a changing environment that will force you to recode a little bit here and a little bit there.
      • It is not so smart to mix lower-level methods into a class if you expect them to change depending on the deployment or implementation of the application. In that case it's better to factor them out so you can make the change in one place. You could even change which classes are used at run time by making a factory. More about factories further down.
      • If somebody else has made a good class (e.g. at CPAN) that does a lot of the stuff you need, you could probably cajole it into working in your application by putting some kind of wrapper code around it. In object parlance, I've now learned, this is called making an adapter.

      Uno managed to dig up a URL to some web pages that describe the patterns in the "Gamma" book, which seems to be a kind of bible about patterns in OOD.

      One pattern is called factory, and it seems to be a constructor that can decide at run time what kind of object to construct. I think this is similar to what you call the main object constructor. I suppose that instead of a parameter it could check a configuration file (or object) in order to decide what kind of object to return, e.g. "MySQL is true". It could also just try (e.g. with eval) to create objects of different classes according to an ordered list. I suppose this is the way the AnyDBM_File module works.
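
The "ordered list plus eval" variant of a factory can be sketched as follows. All class names are hypothetical: Store::MySQL deliberately dies in its constructor to simulate an unreachable database, and Store::NotInstalled does not exist at all.

```perl
use strict;
use warnings;

# Hypothetical storage classes. Store::MySQL pretends its database is missing.
package Store::MySQL;
sub new { die "cannot connect to MySQL\n" }

package Store::FlatFile;
sub new  { my ($class) = @_; return bless {}, $class }
sub name { return 'flat file' }

package StoreFactory;

# Try each class in order and return the first object that can be built.
# eval traps both a failing constructor and a class that is not loaded.
sub create {
    my ($class, @candidates) = @_;
    for my $candidate (@candidates) {
        my $object = eval { $candidate->new };
        return $object if $object;
    }
    die "No usable storage class among: @candidates\n";
}

package main;

my $store = StoreFactory->create('Store::MySQL', 'Store::NotInstalled', 'Store::FlatFile');
print "using: ", $store->name, "\n";
```

A configuration-driven factory would differ only in where the candidate list comes from.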

      The adapter pattern is a way to connect one object to another where certain method and attribute names are expected in the communication. The adapter object goes in between to translate. So it's an interface translator, though Perl doesn't have explicit interfaces as Java does. It could be used to cajole a class from CPAN into a place it didn't expect :-)
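
A minimal adapter sketch. ThirdParty::Tally is an invented stand-in for somebody else's class, not a real CPAN module; the adapter exposes store_event (a name from the original question) and a hypothetical count_for, and translates both to the wrapped object's own method names.

```perl
use strict;
use warnings;

# Stand-in for a class written by somebody else, with its own method names.
package ThirdParty::Tally;
sub new   { my ($class) = @_; return bless { n => {} }, $class }
sub bump  { my ($self, $name) = @_; return ++$self->{n}{$name} }
sub total { my ($self, $name) = @_; return $self->{n}{$name} || 0 }

# Adapter: offers the names the application expects and forwards each
# call to the wrapped object under that object's own names.
package TallyAdapter;

sub new {
    my ($class, $tally) = @_;
    return bless { tally => $tally }, $class;
}

sub store_event {
    my ($self, $match) = @_;
    $self->{tally}->bump($match);
    return;
}

sub count_for {
    my ($self, $key) = @_;
    return $self->{tally}->total($key);
}

package main;

my $stats = TallyAdapter->new(ThirdParty::Tally->new);
$stats->store_event('/index.html') for 1 .. 3;
print "hits: ", $stats->count_for('/index.html'), "\n";
```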

      I didn't find any info on helper objects and polymorphic objects, but I guess that you would create these and store them by reference in a slot in e.g. the Category object?

      Going back to the practical case of my log analyser, it seems unnecessary to factor out the match_line or store_event methods etcetera; that would just create a lot of small classes. There is a data structure called "Tree" that could be made into some kind of general tree-storage class, but that will probably not be worth the effort.

      /jeorgen

Re: High-level methods to low-level: Do I put them together?
by Shoeboy (Sexton) on Jul 16, 2000 at 23:55 UTC
    Is the design with all levels in the same class fundamentally wrong?

    That depends. Will doing it this way substantially increase the effort required to support different types of logs? Are you likely to need to support more log types in the future? Does your current solution contain lots of redundant code that just screams for abstraction and encapsulation? The answers to these questions are the answers to your question.

    I do ~80% of my coding in languages that don't support OOP (Transact SQL and C) and the remaining 20% in Perl, where OOP is optional (albeit frequently useful), so I find the idea of there being 'right' and 'wrong' levels of OOP clutter to be ludicrous. It's a judgement call. There's more than one way to do it. Pick the one that works best for you.

    I slept through the only class in software design I ever took, so feel free to ignore me.

    Shoeboy
    perl -e "do {kill $java, $ada, $cobol, $pascal, $csh;} until die 'Just another perl hacker';"
Re: High-level methods to low-level: Do I put them together?
by moo (Acolyte) on Jul 15, 2000 at 23:04 UTC
    First, if the code works, that's all that matters ;)
    
    Writing this out in text is very difficult for me;  a 
    picture is far more useful.  Furthermore, some of my terminology
    has been "personalized," so email me for an explanation of
    something that you think is weird.
    
    This post really answers your third question: "How would you do it?"
    
    However, based on your description, it seems to me that the
    program does not have any sort of generic definition of a 
    log.  You have a series of log types, but no generic log.  In
    this model, a log shall simply be an aggregation of data 
    elements of differing data types ( build a log from components 
    that any log may have ).  Right now you have specific logs
    with specific components.  Do any of these components overlap?
    
    Each log will be of its own type.  This means that your 
    program will be asking questions of the logs by log type.  
    
    Example:   You have 5 log instances bunched together in 
    the queue.  Your base class is called Logger.
    
    At the correct time, Logger asks:  Who is an http log?
    2 logs raise their hands.  Logger hands them off to 
    Lumberjack_1 and forgets about them.  
    
    Logger then asks:  Who is a wombat log?  3 logs raise their
    hands.  Logger routes them to Lumberjack_2 and forgets 
    about them.  
    
    The Lumberjack_X methods could be in the base class or 
    in Log_types, depending on what you want to do.  I would put
    them in the base class Logger.
    
    In each case, Logger is asking a question of the log.  Once
    an answer is received, an action occurs.
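
One way to sketch the ask-then-hand-off routine described above. The class names Log::Base, Log::Http and Log::Wombat, the type method, and the use of code references as the "Lumberjack" handlers are all assumptions made for illustration; they are one possible reading of the description, not a fixed design.

```perl
use strict;
use warnings;

package Log::Base;
sub new  { my ($class, $line) = @_; return bless { line => $line }, $class }
sub line { return $_[0]{line} }

package Log::Http;
our @ISA = ('Log::Base');
sub type { return 'http' }

package Log::Wombat;
our @ISA = ('Log::Base');
sub type { return 'wombat' }

package Logger;

sub new {
    my ($class) = @_;
    return bless { queue => [], lumberjacks => {} }, $class;
}

sub enqueue {
    my ($self, @logs) = @_;
    push @{ $self->{queue} }, @logs;
    return;
}

# Register the handler (a "Lumberjack") for one log type.
sub lumberjack_for {
    my ($self, $type, $handler) = @_;
    $self->{lumberjacks}{$type} = $handler;
    return;
}

# Ask "who is a <type> log?", hand the answers off, then forget them.
sub route {
    my ($self, $type) = @_;
    my (@hands_up, @rest);
    for my $log (@{ $self->{queue} }) {
        if ($log->type eq $type) { push @hands_up, $log }
        else                     { push @rest,     $log }
    }
    $self->{lumberjacks}{$type}->($_) for @hands_up;
    $self->{queue} = \@rest;
    return scalar @hands_up;
}

package main;

my %seen;
my $logger = Logger->new;
$logger->lumberjack_for(http   => sub { push @{ $seen{http} },   $_[0]->line });
$logger->lumberjack_for(wombat => sub { push @{ $seen{wombat} }, $_[0]->line });
$logger->enqueue(
    Log::Http->new('GET /'),      Log::Wombat->new('burrow 1'),
    Log::Wombat->new('burrow 2'), Log::Http->new('GET /about'),
    Log::Wombat->new('burrow 3'),
);

my $http_count   = $logger->route('http');
my $wombat_count = $logger->route('wombat');
print "http: $http_count, wombat: $wombat_count\n";
```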
    
    The basic definition of the log will be in your base class.
    All elements common to all logs live here.  Furthermore,
    the base class will contain all common methods.  
    
    All deltas will be in derived classes, as you well know.
    The idea here is that Logger does not know about the different
    types of logs.  In order to get that info, it has to get
    it from Log_types.  Log_types doesn't know much about
    log components, so it defers to wombat
    or http to get the correct object with the correct data.
    
    The key is to defer from generic to specific.
    
    The notion of virtual base class will probably help out
    here.  In this case, base class Logger will have a virtual
    base class called Log_types.  Underneath Log_types will
    be the various types of logs:  Http and Wombat.
    
    But what does Logger have besides Log_types?  It also
    has a Report_interface.  Report_ifc could then have
    Gui_ifc and Text_ifc.  Report_ifc would be another
    virtual base class.
    
    The 'Has-a' relationship between Logger and Log_types
    for making an instance can look something like this:
    
    Option 1:
    
    in your base class:
    
    $log -> make_log( 'wombat' ); # wombat is-a type of log
    
    in your derived class:
    
    sub make_log {
        my ( $log, $log_class_name ) = @_;
        my $message = $log_class_name -> new(); # run the correct new()
        $log -> logs( $message );  # this is Has-a in action
        return $log;
    }
    
    Notice how the instance, $message, is placed within the 
    context of $log. 
    
    
    Option 2:
    
    in your base class:
    
    $log -> add_log( 'wombat_log', 'log_details' );
    
    in your derived class:
    
    sub add_log {
        my ( $log, $log_name, $log_data ) = @_;
        my $message = $log -> logs(); # this gets data AND renames
                                      # the instance

        $message -> add( $log_name, $log_data ); # adds data to instance by
                                                 # going to the correct 'add' method

        return $log;
    }
    
    
    Programming Perl has a small section on 'Has-a' on page 318.
    
    When you are building this, just get it to work for one
    simple log ( like the HTTP log or something ).  
    
    I hope this helps.
    
    --moo
    
      Thanks, moo, for thinking about the log analyser code. I am developing it for a customer that agreed to make it Open Source, so it is available at sourceforge, with some docs on what the objects do. The name of the analyser is a bit weird because I didn't want to put it in any of the CPAN namespaces. It's in beta, with a lot of out-of-date comments and other weird stuff; it's probably not of any use to others yet.

      Regarding making the log analyser general: I only intended it for web server logs, but it is built to produce statistics from any log file. The only assumption about the kind of log it parses is that it consists of lines (Monday correction: it also assumes there is a date pattern of a particular format). The line assumption could easily be removed by changing the LogFile object. Categories are defined as a pattern and a title for the category, so you should be able to match anything that is, er, matchable. You can also put in some code for each category (e.g. the Query module unescapes the query string before storing statistics). The next step will be to add a slot for a private data structure that an object can use as memory between lines, for recording sessions.

      A lot of other stuff is hard coded, e.g. HTML.

      /jeorgen