trs80 has asked for the wisdom of the Perl Monks concerning the following question:

I am working on a Kodak CD processor utility to make some use out of this mountain of images my wife has built up for me. I have around 100 CDs and I decided I had better write a Perl utility to process them into a more useful block of information. I looked on CPAN and here, but didn't really find much. I am working with some older (year 2000) Kodak CDs, so possibly the format has changed, but I am trying to make everything as generic as possible. To that end I am seeking some wisdom on how best to provide an interface to some of the methods.

I read the INFO.CD file and get the OrderId, which I use as my reference point: since the CD is stamped with this ID, it gives you something to refer to in the future if you want to know where the original was. I then get a list of the photos from the disk. This is where my first question lies. I had been saving the images to the HD for processing and archiving, but that led to redundant archiving, since the CD should be the archive. It was also eating up a lot of HD space, so now I just store the list and always go back to the source to make any modifications. Would it be useful to make this a configuration option, that is, storing the files on HD versus going back to the source?

Once I get the list I then process each image with Image::Magick and convert the images to presently predefined sizes. Which brings me to my second question. What would be a good generic interface to this? I am presently toying with this:
    sub resize_image {
        my ($self, $file) = @_;
        my ($image, $x);
        print "Creating image " . $self->percent . "% the size of $file\n";
        $image = Image::Magick->new;
        $x = $image->Read($self->source_directory . "/$file");
        warn "$x" if "$x";
        $x = $image->Resize('geometry' => $self->percent . "%");
        warn "$x" if "$x";
        $x = $image->Write($self->output_directory . "/" . $self->prefix . "$file");
        warn $x if $x;
    }

    sub make_various_sizes {
        my ($self, $size_list) = @_;

        # size list example:
        # $size_list = [
        #     { prefix => 'small_',  percent => '15' },
        #     { prefix => 'medium_', percent => '50' },
        # ];

        foreach my $size (@{$size_list}) {
            $self->prefix( $size->{'prefix'} );
            $self->percent( $size->{'percent'} );
            $self->resize_images();
        }
    }

    sub resize_images {
        my ($self) = @_;
        foreach my $file (@{ $self->{'list'} }) {
            $self->resize_image($file);
        }
    }


It isn't flexible enough, though, in my opinion, if I add it here in one of the code areas or upload it to CPAN. It should provide for, in my opinion, at least the following:

Other methods currently store the original image names in an XML configuration file via the XML::Simple module. In the works are XML configuration files that store additional information about the images and, at a higher directory level, the CD itself. I have rough HTML generators, but I think I can do away with them and let the user process the XML, or perhaps tie in with some templating module. I don't have much familiarity with the templating modules, so that would most likely be a long way off.

I am going to be adding a web interface utility to this to allow for working with existing image stores, so any suggestions about how people interact (or would like to interact) with them, and that might be good to have in here, would be helpful. As for the web interface, right now my wife has requested:

What other features might be useful in the core module in general?
What might be a good name for this module?
What modules can be recommended to handle some of this as well, so I don't reinvent the wheel? I looked at some of the existing ones, and most of them require creating the captions, titles, etc. in text files by hand. Is there one that allows creating and editing them through a web interface?

If what I have started above is already in a module somewhere please point me in the right direction there as well.

While working on this I also created a nice little email image extractor (well, any or all attachments really) that works with mbox-style mail stores. I used it to parse my wife's Win32 Netscape mail. I will be posting it here once the next release of Mail::MailboxParser is out, since it should have some Win32 issues corrected in it. Is this something that should be included in what I described above, or should it be a separate module (tree)?

Re: Object Interface and Module Question
by mstone (Deacon) on Feb 21, 2002 at 21:12 UTC

    First off, you need to slow down. What you have is a wish list, not a module specification. That's not a bad thing -- the wish list marks out where you want to go -- but one should never mistake compiling a wish list for designing code. A good wish list is open-ended, and can easily contain items that won't work together. What's more, wishes don't translate to code very well. They're fuzzy. Either they leave out critical details -- "we want a code optimizer that skips lines that will never execute" (small problem: it's impossible) -- or they specify details that may or may not be necessary -- "... then the hit counter pulls the number of hits from a MySQL database ..."

    Programming is the art of thinking clearly, so the best place to start is by defining your terms. For OOP, that means breaking your proposed system down into Entities, Attributes, and Relationships.

    • An Entity is anything you'll want to treat as a distinct unit.
    • Attributes are qualities that make one Entity different from another.
    • Relationships define how Entities talk to each other.

    In general, it's useful to start with Entities that match things in the system you're trying to model. The fastest way to do that is walk through your wish list and extract all the nouns.

    You have a collection of CDs that contain images, so the most natural starting place is to define Collection, CD, and Image Entities, then start assigning Attributes and Relationships. Each CD has an ID number -- attribute. CDs contain Images -- relationship. Each Image has an ID that identifies it uniquely on the CD -- attribute. Each image has binary data that programs can use to paint pixels on the screen -- attribute.
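
    As a rough illustration of that breakdown, here is a minimal sketch of the three Entities. Only the Collection/CD/Image split comes from the description above; the method names (add_image, add_cd, cd, images) and the sample IDs are made up for the example.

    ```perl
    use strict;
    use warnings;

    # Entity: Image. Attribute: an ID that is unique on its CD.
    package Image;
    sub new {
        my ($class, %args) = @_;
        return bless { id => $args{id} }, $class;
    }
    sub id { $_[0]{id} }

    # Entity: CD. Attribute: an ID. Relationship: a CD contains Images.
    package CD;
    sub new {
        my ($class, %args) = @_;
        return bless { id => $args{id}, images => [] }, $class;
    }
    sub id { $_[0]{id} }
    sub add_image {
        my ($self, $image) = @_;
        push @{ $self->{images} }, $image;
        return $self;
    }
    sub images { @{ $_[0]{images} } }

    # Entity: Collection. Relationship: a Collection holds CDs, keyed by CD ID.
    package Collection;
    sub new {
        my ($class) = @_;
        return bless { cds => {} }, $class;
    }
    sub add_cd {
        my ($self, $cd) = @_;
        $self->{cds}{ $cd->id } = $cd;
        return $self;
    }
    sub cd {
        my ($self, $id) = @_;
        return $self->{cds}{$id};
    }

    package main;

    my $collection = Collection->new;
    my $cd = CD->new(id => 'ORDER-0001');              # made-up order ID
    $cd->add_image(Image->new(id => 'IMG0001.JPG'));   # made-up file name
    $collection->add_cd($cd);

    # List the image IDs on one CD, reached through the Collection.
    print join(', ', map { $_->id } $collection->cd('ORDER-0001')->images), "\n";
    ```

    Each class knows only about the thing directly beneath it, which keeps the Relationships easy to trace.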

    The data itself is formatted, and that looks like a sub-attribute. So it might be worthwhile to create an Image_data class just for that package of information. Then the Image_data object can have a 'format' attribute. You could also give it a 'location' attribute, which would solve your problem with storing images on CD or HD:

        sub Image_data::location {
            my $O = shift;
            my $location = $O->local_storage() || $O->default_storage();
            return ($location);
        }

        sub Image_data::local_storage {
            # returns the path to a file on HD if the image is stored
            # locally, and a null string if the image is not stored locally.
        }

        sub Image_data::default_storage {
            # combines ID info from the parent Image and CD objects to
            # create a filepath for the image on CD. this routine must
            # *always* return a valid filepath.
        }
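
    To make the fallback concrete, here is one way the two stubs might be filled in. This is only a sketch under assumptions: the constructor and the field names (local_dir, cd_mount, cd_id, image_id) are hypothetical, as is the directory layout of one folder per CD ID on the hard drive.

    ```perl
    use strict;
    use warnings;

    package Image_data;

    # Hypothetical constructor; all field names are assumptions.
    sub new {
        my ($class, %args) = @_;
        return bless { %args }, $class;
    }

    # Path to the copy on HD if one exists, otherwise an empty string.
    sub local_storage {
        my $self = shift;
        return '' unless defined $self->{local_dir};
        my $path = "$self->{local_dir}/$self->{cd_id}/$self->{image_id}";
        return -e $path ? $path : '';
    }

    # Always returns a path on the CD, built from the mount point and image ID.
    sub default_storage {
        my $self = shift;
        return "$self->{cd_mount}/$self->{image_id}";
    }

    # Prefer the local copy; fall back to the CD.
    sub location {
        my $self = shift;
        return $self->local_storage || $self->default_storage;
    }

    package main;

    my $data = Image_data->new(
        cd_mount  => '/mnt/cdrom',     # assumed mount point
        cd_id     => 'ORDER-0001',     # made-up order ID
        image_id  => 'IMG0001.JPG',    # made-up file name
        local_dir => undef,            # no local archive configured
    );
    print $data->location, "\n";       # falls back to the path on the CD
    ```

    Callers only ever ask for location(), so whether images get copied to HD becomes a configuration detail instead of something every method has to know about.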

    It's also helpful to think about your classes in terms of Roles and Responsibilities. These are really just different ways of looking at Relationships, but Roles and Responsibilities help you assign scope to each class. Ask yourself, "what does this class do?" and "what should this class delegate to somebody else?" and you'll avoid hodge-podge classes that do everything under the sun. If you can't answer either question for a given class, it's a sign that you need to spend some time thinking about that class's scope.

    Finally, it's easiest to design programs back-to-front. Decide what output you want, then figure out what input you need to generate that output. Then figure out where those inputs come from, and whether they're output from some other part of the system. Eventually, you'll chase everything back to configuration variables hardwired into the program that let you load further information from the drive.

    So -- freeze the wish list for a while, and break your ideas down into Entities, Attributes, and Relationships. That will partition your system, and the module interfaces will flow naturally from there.

      Thank you for the detailed reply, despite my horribly assembled post.

      I *really* do want a wish list first.

      My wife has been very patient about our photos, and since I am currently not working she thinks this is a good time to get caught up with things around the house, our photos being one of them. I am going to put together something for her as a prototype, and then, if it is well received and I am pleased with it, I will invest more time in the actual requirements vs. the wish list, at which point your advice will be very helpful.

        I'm not knocking wish lists. They are where all software starts. It's just easy to fall down the rabbit hole and keep adding stuff to the wish list until it becomes a snarled mess you can't possibly turn into code.

        Trust me, I'm speaking from experience on this one.. ;-)

        It's also cool to need something ASAFP because The Boss (and a wife definitely counts) wants to see progress. But when speed counts, you need to be very careful about limiting the scope of your work. Nothing can kill a schedule quite the way feature creep can.

        One of the most useful programming axioms I've ever tattooed inside my eyelids is: anything is better than nothing. It sounds like a tautology, but having worked many projects in many settings, I've found that even the ugliest kludge will take you farther and faster than an empty source file and a bunch of grandiose dreams.

        Dreaming is easy, and programming is hard. Therefore, dreaming (usually called: 'planning', 'collecting requirements', 'designing', etc) is a great way to escape the ugly reality of having to write actual code. Not that I'm against planning and design -- far from it -- but if you don't know how to turn what you're doing into code, you're not planning or designing. You're daydreaming. Planning and design are every bit as hard as programming itself, and the only way to tell whether you've fallen into random speculation is to try turning the work into code.

        If you can't even reduce your ideas to a set of nested print() functions:

            sub do_something {
                print "now we want to do this.\n";
                step_1();
                step_2();
                step_3();
            }

            sub step_1 {
                print " handle this part of the job.\n";
            }

            sub step_2 {
                print " handle the next part of the job.\n";
            }

            sub step_3 {
                print " handle some other part of the job.\n";
            }

            do_something();

        what you're really contemplating is your navel.

        People fall into the trap of pseudo-planning because they feel overwhelmed. They have a great big wish list, and writing minimal prototypes just seems.. unworthy. So they try to turn the wish list into a map of the final product, and busy-wait themselves into oblivion.

        Don't try to plan software all in one go -- that's not how we build the stuff. Piet Hein, mathematician and all-around genius, wrote little poems called 'grooks' as a hobby. Donald E. Knuth, arguably the world's foremost source of programming wisdom, quotes one of them frequently as the essential model for software development:

        "The road to wisdom? Well, it's plain
        and simple to express:
        Err and err and err again,
        but less and less and less."

        If you need to be fast, zero in on the smallest number of things your program can possibly get away with doing. Make those work, then start thinking about other things you can add. Get used to iterating through the cycle:

        • make it work
        • make it correct
        • make it fast

        where each 'make it correct' pass gives you enough foundation to keep the next 'make it work' pass from becoming a nightmare. Leave the 'make it fast' stuff for last, because few things screw up a program worse than 'optimizing' it before the part you're tweaking has been set in stone long enough to acquire a few layers of pigeon crap.

        That cycle will zero in on usable, well-built code much faster than top-down design based on speculation instead of code.