melguin has asked for the wisdom of the Perl Monks concerning the following question:

I'm in the midst of working on a recipe program in Perl with some other people across the globe (so far France, Italy, Russia, and Brazil, in addition to the U.S. & Canada) via SourceForge. With the added help, the project is progressing nicely, but I find that I'm spending lots of time reviewing code. This is needed, I think, but as the program file grows (just under 1500 lines, but increasing fast with each feature) it takes more and more time to go over.

I was thinking of splitting it up into the main program(s) and having lots of modules. That way only one or two files will be affected each time.

Is my thinking sound, or am I missing some important issues (on both sides)? If it's a good idea, what is the best way to logically divide the code? Also, can anyone give advice (and pointers to info) on working on/leading projects in this manner?

melguin.

Re: hacking a project in groups
by tachyon (Chancellor) on Sep 01, 2001 at 15:27 UTC

    What you are talking about is in essence Object Oriented Design. Forget all the details, OO is about abstraction. Here is a basic OO program.

                    interface
    [MAIN Program] <---------> [DATA Module] <---------> data
                            (abstraction layer)

    This program consists of two logical chunks, MAIN and DATA. MAIN does the stuff and DATA handles the storage/retrieval of the data. All communication between MAIN and its data is via an interface supplied by DATA. An interface is simply a series of methods (subs) that DATA makes available to MAIN. Let's say that this interface consists of two methods:

    get_data_as_array()
    set_data_from_array()

    The MAIN program saves or gets its data only via these two methods supplied by DATA. This is called abstraction and the DATA Module is described as an abstraction layer. The abstraction layer serves to separate the program from the storage of its data. How DATA handles getting the data as an array via the get_data_as_array() method or setting the data from an array via the set_data_from_array() method is its own business. DATA may use a flatfile. It might use an RDBMS. MAIN does not need to know. Provided DATA continues to supply these two methods (and they work the same) it is free to do the actual data handling any way it wants. If DATA changes from using a flatfile to an RDBMS then MAIN is *not affected at all* provided the interface is maintained. The abstraction layer provided by the interface is what makes this possible.

    So the DATA module abstracts MAIN from the storage of its data making our program into two quasi independent units. The real world benefit is that one team can do MAIN and another DATA. Provided they agree on the interface, they are free to do it any way they want :-)
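    A minimal sketch of that idea, under some loud assumptions: the package names MemoryData and FlatFileData, the helper add_recipe_name() and the one-record-per-line file format are all made up for illustration and are not from the project. The only thing taken from the text above is the pair of interface methods. MAIN calls those two methods and nothing else, so either backend can be swapped in.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical DATA backend #1: keeps the records in memory.
package MemoryData;

sub new { my ($class) = @_; return bless { rows => [] }, $class }

sub get_data_as_array { my ($self) = @_; return @{ $self->{rows} } }

sub set_data_from_array {
    my ($self, @rows) = @_;
    $self->{rows} = [@rows];
    return;
}

# Hypothetical DATA backend #2: a flat file, one record per line.
package FlatFileData;

sub new {
    my ($class, %args) = @_;
    return bless { file => $args{file} }, $class;
}

sub get_data_as_array {
    my ($self) = @_;
    return () unless -e $self->{file};
    open my $fh, '<', $self->{file} or die "Can't read $self->{file}: $!";
    chomp(my @rows = <$fh>);
    close $fh;
    return @rows;
}

sub set_data_from_array {
    my ($self, @rows) = @_;
    open my $fh, '>', $self->{file} or die "Can't write $self->{file}: $!";
    print {$fh} "$_\n" for @rows;
    close $fh or die "Can't close $self->{file}: $!";
    return;
}

# MAIN: knows only the two interface methods, not how the data is stored.
package main;

sub add_recipe_name {
    my ($data, $name) = @_;
    my @names = $data->get_data_as_array();
    push @names, $name;
    $data->set_data_from_array(@names);
    return scalar @names;    # number of records now stored
}

my $data = MemoryData->new;
add_recipe_name($data, 'Ratatouille');
add_recipe_name($data, 'Tiramisu');
print join(', ', $data->get_data_as_array()), "\n";
```

    Replacing MemoryData->new with FlatFileData->new(file => 'recipes.txt') should leave add_recipe_name() and the rest of MAIN untouched, which is the whole point of agreeing on the interface first.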

    Most programs have these sort of logical chunks:

    MAIN - to tie it all together
    DATA - to handle data storage and retrieval
    MUNGE - to do fancy stuff to the data
    USER_INTERFACE - to make it pretty
    WIDGET - to do all the stuff we forgot elsewhere :-)

    So in short you need to sit down, look at the logical chunks, design the interfaces (you can add methods to your heart's content later, you just can't delete or change existing ones) and write a spec.

    You also need to apply version control. There is nothing worse than fixing a bug in a chunk of software only to discover you just updated the old version and now have to transfer all the work to the current widget and redo all the testing. There is heaps of stuff on this but that's the exec summary. Hope it helped.

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: hacking a project in groups
by maverick (Curate) on Sep 01, 2001 at 23:33 UTC
    I am in *complete* agreement with tachyon on this. If there are going to be more than 2 people working on this, then modularizing it is very helpful. If this expands to 5 or more, then it's going to be about the only way you can keep your sanity.

    Think about your design. Draw out the major modules on paper. Think about the design again. Confer with the other people on the design. Then, delegate control and responsibility. Tell Joe that the Foo subsystem is his and let him oversee it, tell Bob that the Bar one is his, etc. Then you just watch over it all and ensure that everything runs smoothly.

    Most every large project works this way. The Linux kernel, Mozilla, Gaim, these are just a few that I can name off the top of my head.

    /\/\averick
    perl -l -e "eval pack('h*','072796e6470272f2c5f2c5166756279636b672');"

Re: hacking a project in groups
by Rudif (Hermit) on Sep 02, 2001 at 00:42 UTC
    Hi melguin

    I agree *totally and completely* with everything that tachyon and maverick said.

    If you agree, too, your question might be: where do I start?

    I would suggest that you read up on eXtreme Programming, especially about code refactoring and unit testing. Also search the PM site here for eXtreme Programming.

    Your code saute_0_0_2 looks neat and well organized, so it should not be difficult to morph it.

    Let me try to suggest a recipe (untested) for putting the OO advice into practice.

    You have 2 perl files -
    dbsetup.pl, 152 lines - uses DBI - creates the database schema
    saute.pl, 1435 lines - uses Gnome and DBI - main program, creates Gnome main window and accessories, lets the user view and edit recipes

    If I were you, I would start with making a module (a.k.a. package a.k.a. class) that wraps all your interactions with the database. This would separate your data storage and maintenance from your GUI application, and let you develop each separately, perhaps by different participants.

    In particular, all subs (or their inner code chunks) in the main program that interact with the db should become methods of this module.

    Step 1

    Create a module file RecipeDb.pm containing this

    #!/usr/bin/perl -w
    use strict;
    package RecipeDb;
    use DBI;

    sub new {
        my ($class, %args) = @_;
        my $self = {
            dbname   => $args{dbname},
            dbuser   => $args{dbuser},
            dbpasswd => $args{dbpasswd},
            # more options if needed ...
            # workspace
            dbh      => undef,
        };
        bless $self, $class;
        $self->{dbh} = DBI->connect("DBI:mysql:$self->{dbname}",
            $self->{dbuser}, $self->{dbpasswd},
            { PrintError => 1, RaiseError => 0 })
            || die "Can't connect to $self->{dbname}: $DBI::errstr";
        return $self;
    }

    sub DESTROY {
        my $self = shift;
        $self->{dbh}->disconnect() if $self->{dbh};
    }

    # your code from saute.pl, slightly modified
    sub get_recipe_info {
        my ($self, $recipeID) = @_;    # notice $self
        my %recipe;

        # get main recipe parts
        # notice $self->{dbh}; the ? placeholder replaces the interpolated $recipeID
        my $sth = $self->{dbh}->prepare("SELECT name,descr,instruct,
            preptime,notes,source FROM recipes WHERE PriKey=?");
        $sth->execute($recipeID);
        while (my @row = $sth->fetchrow_array) {
            $recipe{"recipeID"} = $recipeID;
            $recipe{"name"}     = $row[0];
            $recipe{"descr"}    = $row[1];
            $recipe{"instruct"} = $row[2];
            $recipe{"preptime"} = $row[3];
            $recipe{"notes"}    = $row[4];
            $recipe{"source"}   = $row[5];
        }
        $sth->finish();

        # get ingredients
        my @ingredients_list;
        # notice $self->{dbh}
        $sth = $self->{dbh}->prepare("SELECT recipeIngredients.quantity,units.name,
            ingredients.name FROM recipeIngredients,ingredients,units
            WHERE (recipeIngredients.recipe=?)
            AND (ingredients.PriKey=recipeIngredients.ingredient)
            AND (units.PriKey=recipeIngredients.units)");
        $sth->execute($recipeID);
        while (my @row = $sth->fetchrow_array) {
            push (@ingredients_list, [$row[0],$row[1],$row[2]]);
        }
        $sth->finish();
        $recipe{"ingredients"} = \@ingredients_list;

        # get categories
        my @categories_list;
        # notice $self->{dbh}
        $sth = $self->{dbh}->prepare("SELECT categories.name
            FROM recipeCategories,categories
            WHERE (recipeCategories.recipe=?)
            AND (categories.PriKey=recipeCategories.category)");
        $sth->execute($recipeID);
        while (my @row = $sth->fetchrow_array) {
            push (@categories_list, $row[0]);
        }
        $sth->finish();
        $recipe{"categories"} = \@categories_list;

        return \%recipe;
    }

    # more methods here

    1; # true enough

    Run this file to make sure that it compiles.

    Step 2

    Write a little test program, say TestRecipeDb.pl

    use strict;
    use RecipeDb;
    use Data::Dumper;

    my ($DBNAME, $DBUSER, $DBPASSWD) = ("", "", "");    # please fill in

    my $rdb = RecipeDb->new(
        dbname   => $DBNAME,
        dbuser   => $DBUSER,
        dbpasswd => $DBPASSWD,
    );
    my $ID = "";    # please fill in
    my $recipe = $rdb->get_recipe_info($ID);
    print Dumper $recipe;

    Does it work? Is the dumped recipe what you expected? If not, fix it.

    Step 3

    Now copy the code that creates $rdb from TestRecipeDb.pl into your main program, saute.pl.

    Replace all calls to get_recipe_info() by calls to $rdb->get_recipe_info().

    Test thoroughly. Does it work as before? If not, fix it.

    OK? Now remove sub get_recipe_info() from saute.pl.

    Step 4..N

    Repeat what you did for get_recipe_info() for all other subs that interact with the database: make a clone in RecipeDb, test it, call it from the main program, remove the original sub from the main program. When done, remove use DBI; from the main program.

    What next?

    The above should leave you with a considerably smaller saute.pl, a module RecipeDb.pm and a test program for it. And enough OO experience to take another hard look at what you have and decide what would be the next useful steps.

    You should probably move some of the code from dbsetup.pl into RecipeDb.pm, and leave dbsetup.pl as a command-line interface to creating the database. If some day you decide to provide a graphics interface for this activity, you'll be ready.

    Next, turn those recipe hashes into packages, perhaps?
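    One way this could look, as a rough sketch only: the class name Recipe, its accessors and ingredient_lines() are hypothetical, and the sample data is invented. The one thing borrowed from RecipeDb is the shape of the hash that get_recipe_info() returns (name, preptime, a list of [quantity, unit, name] ingredients, a list of categories).

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Recipe;

# Wrap a hash shaped like the one get_recipe_info() returns.
# The two top-level lists are copied so the object does not share
# them with the caller's hash.
sub new {
    my ($class, $info) = @_;
    my $self = {
        %$info,
        ingredients => [ @{ $info->{ingredients} || [] } ],
        categories  => [ @{ $info->{categories}  || [] } ],
    };
    return bless $self, $class;
}

# Read-only accessors (hypothetical names).
sub name        { $_[0]{name} }
sub preptime    { $_[0]{preptime} }
sub ingredients { @{ $_[0]{ingredients} } }
sub categories  { @{ $_[0]{categories} } }

# Each ingredient is [quantity, unit, name]; join the parts for display.
sub ingredient_lines {
    my ($self) = @_;
    return map { join ' ', @$_ } $self->ingredients;
}

package main;

# Invented sample data standing in for a get_recipe_info() result.
my $recipe = Recipe->new({
    recipeID    => 1,
    name        => 'Pancakes',
    preptime    => '20 min',
    ingredients => [ [ 2, 'cups', 'flour' ], [ 1, 'cup', 'milk' ] ],
    categories  => ['breakfast'],
});

print $recipe->name, " (", $recipe->preptime, ")\n";
print "  $_\n" for $recipe->ingredient_lines;
```

    The GUI code would then ask the object for what it needs instead of reaching into hash keys, so the hash layout can change later without touching saute.pl.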

    You can turn the test program into a regression test by hardcoding the expected results and comparing with what you get from the database.
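    A sketch of what such a regression test might look like, using the core Test::More module. To keep it runnable without a database, FakeRecipeDb is a made-up stand-in that returns canned data; in the real test you would create a RecipeDb object instead, and the expected values would be whatever you know is stored for that recipe ID.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for RecipeDb so the sketch runs without MySQL.
package FakeRecipeDb;

sub new { return bless {}, shift }

sub get_recipe_info {
    my ($self, $recipeID) = @_;
    return {
        recipeID    => $recipeID,
        name        => 'Pancakes',
        ingredients => [ [ 2, 'cups', 'flour' ] ],
        categories  => ['breakfast'],
    };
}

package main;

my $rdb = FakeRecipeDb->new;    # in the real test: RecipeDb->new(...)

# Hardcoded expected result for recipe 1 (invented sample values).
my %expected = (
    recipeID    => 1,
    name        => 'Pancakes',
    ingredients => [ [ 2, 'cups', 'flour' ] ],
    categories  => ['breakfast'],
);

# is_deeply walks the nested structure and reports the first difference.
is_deeply($rdb->get_recipe_info(1), \%expected,
    'recipe 1 matches the hardcoded result');

done_testing();
```

    Run it after every change to RecipeDb.pm; if a refactoring alters what comes back from the database, the test should say so before your collaborators do.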

    Oh, I almost forgot the most important advice: enjoy!

    Rudif