Venerable Monks. I pose the following as a meditation as it is something that I often ponder.

To use or not to use !? That is the question!

I often find myself needing a piece of code to perform a certain function, as we all do. CPAN is filled with modules to reference, letting us glean from the experience and work of others. So, my cogitation is this: should one call upon a module to perform a function when only a small fraction of the module's worth is actually being used?

My feelings are such that if we intend to use less than 50% of what the module offers, we are better off writing the function ourselves. If we use, or plan to use in the future, more of what the module offers, then call the module.

Now, my reasoning for this. I firmly believe that writing very fast, tight, robust, and streamlined code is of great benefit to the users, to the system (by avoiding unneeded overhead), and to the programming community as a whole. Ergo, my disposition is to use an entire module only if its functions are needed. For future development, and we all know that happens, including the module becomes a benefit because it provides a framework upon which to grow.

So, should one use a module that is obviously "heavy" for the task simply to make life easier? Or should one take the extra analytical steps to make sure that inclusion of the module makes sense in any given case?

I'm constantly reminded of the growing ideology among programmers that it doesn't matter if the code is a bit bloated - the hardware is "beefy" enough to overcome any sluggishness in code. Is that laziness on the part of the programmer? For me it is - I would be hard pressed to write/release a program which I know is larger than it needs to be (not obfuscating for maintainability) regardless of the hardware that will run the program.

Thoughts?

Regards,

Draconis

Re: To USE or not to USE
by gmax (Abbot) on Jul 22, 2003 at 14:25 UTC
    My feelings are such that if we intend to use less than 50% of what the module offers, we are better off writing the function ourselves.

    In this case, quantity doesn't matter.

    Let's say you need to use a database. The DBI has a few dozen methods, but you just need to use connect, selectall_arrayref, and disconnect.
    Would you write those methods yourself or would you use DBI?

    Think about what's behind those three methods, in terms of complexity, efficiency, and testing. I'm sure you don't want to go through such an ordeal.
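To make the DBI case concrete, here is a minimal sketch that uses only the three methods named above. The helper name fetch_all and its argument order are made up for illustration; only connect, selectall_arrayref, and disconnect come from DBI itself.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper (not part of DBI): run one query and return every row
# as an array reference of array references.
sub fetch_all {
    my ($dsn, $user, $pass, $sql, @bind) = @_;

    # RaiseError makes DBI die on failure, so no return codes need checking.
    my $dbh = DBI->connect($dsn, $user, $pass,
        { RaiseError => 1, PrintError => 0 });

    # Prepare, execute, and fetch in a single call.
    my $rows = $dbh->selectall_arrayref($sql, undef, @bind);

    $dbh->disconnect;
    return $rows;
}
```

Assuming the DBD::SQLite driver happens to be installed, fetch_all('dbi:SQLite:dbname=:memory:', '', '', 'SELECT 1 + 1') would return a single row holding one value. Pointing the same helper at another database should mostly be a matter of changing the DSN string, although SQL dialects can still differ.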

    That's just an example, but I could say the same for a dozen modules that I use on a daily basis.

    IMO, you should use what is well tested and established, and rewrite only if you have a really strong reason for that, such as buggy code and an unresponsive author.

    _ _ _ _ (_|| | |(_|>< _|

      In addition, if you need to gut one part of your system for another (say, DBD::CSV -> DBD::mysql -> DBD::Oracle), I would much rather be able to change (basically) my connect string, then test, test, and test, than have to rewrite the guts of each of these.

      Sometimes the overhead is due to bolting a consistent interface onto your module, so that it can replace / be replaced by something less / more appropriate.

      Good question, though.

Re: To USE or not to USE
by broquaint (Abbot) on Jul 22, 2003 at 14:10 UTC
    As with any situation where you're including external code in your production code base, you have to evaluate what it contributes and what the pros and cons are of including said code. But from the perspective you're speaking from (external code will slow down my code, no?), I personally don't think this is an issue. Most code from the likes of CPAN is heavily tested and will hopefully have thought of most of the things that you wouldn't have if you were to rewrite the necessary code yourself. Rewriting what you need will probably take far longer, as a lot of the code on CPAN has quite a few man-hours behind it and is quite often written for a specific purpose by people who know the ins and outs of the problem space. The issue of speed is really a bit of a premature optimisation: generally the bloat of a module is the amount of code, which, if you're only using a small amount, is a non-issue, as the only penalty you'll incur is the compile-time parsing. Of course, this is not to say include every module, all and sundry, for every piece of code that could use a module, just to say that far more often than not speed isn't a valid excuse for avoiding CPAN-type code.
    HTH

    _________
    broquaint

Re: To USE or not to USE
by adrianh (Chancellor) on Jul 22, 2003 at 14:01 UTC

    In my experience programmer time is normally more expensive than any other development resource. So I'd use the module.

    Optimise when performance becomes an issue.

Re: To USE or not to USE
by Zaxo (Archbishop) on Jul 22, 2003 at 14:27 UTC

    It's rare that a successful program doesn't gather feature requests like flies. A module provides a guide and a tested library for a problem domain. Good ideas are likely to be supported.

    One of the biggest modules in perl is POSIX.pm, but it is commonly used for just a single sub from it. The weight of the module on your program is controlled by the import mechanism.
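As a small sketch of that import mechanism, the snippet below pulls a single function out of POSIX. The choice of floor and ceil is arbitrary. Note that the import list decides what gets aliased into your package; the module file itself is still loaded.

```perl
use strict;
use warnings;
use POSIX qw(floor);    # alias only floor into the current package

my $whole = floor(3.7);
print "floor(3.7) = $whole\n";

# Other POSIX functions stay reachable by their full name.
my $up = POSIX::ceil(3.2);
print "POSIX::ceil(3.2) = $up\n";

# Nothing except floor was aliased into this namespace.
print "ceil imported? ", (defined &ceil ? "yes" : "no"), "\n";
```

Writing use POSIX (); instead would import nothing at all, leaving every call fully qualified.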

    You wouldn't reject using libc if you were calling less than half of it, would you? How did you get that figure?

    After Compline,
    Zaxo

      Zaxo,

      I got that number (50%) from nowhere really - just trying to isolate a standard, and/or determine if a standard is even really necessary.

      Everyone so far has made very valid points and I would most definitely not re-write portions of POSIX.pm or the DBI, etc. I think that would be a foolish thing to pursue since they work well.

      I was hoping to see what the "mindset" is among the monks and from the community in regard to this topic. I know that in other languages this is an issue - and I am glad to see that with Perl that is not the case. *Just another reason to write in Perl!

Inadequate laziness, false hubris
by skyknight (Hermit) on Jul 22, 2003 at 17:45 UTC

    "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
    -- Donald Knuth

    Your time as a programmer is very valuable, and if you're spending substantial time re-inventing the wheel, you're not allocating your time optimally, whether from your own perspective, your employer's, or that of the community at large. If some library is broken, or inadequate, by all means create a better wheel, but make sure that your efforts are not gratuitous.

    I used to fall into this trap all the time, but I eventually realized the futility of it. Oft times a library that you can import from elsewhere will have already hashed out all of the subtleties and painfully slogged through the pitfalls. Stand on the shoulders of the creators instead of reliving their travails.

    Take care of the big picture first, and don't get bogged down in the minutiae. Not only may you gain a minimal performance boost, but your code may not even be functionally correct if it is a complex problem. You might not even fully comprehend the complexity of the problem. For evidence of this, look no farther than the regex required to properly and exhaustively validate email addresses.

    Later in the project, when you've got your functionality down pat, shift your attention to the internals, going back and sussing out the bottlenecks. If you can isolate a particular library as a drag on your throughput, then and only then go to the trouble to rewrite it yourself.

Re: To USE or not to USE
by chromatic (Archbishop) on Jul 22, 2003 at 16:47 UTC

    Have you ever used Perl formats? Tied variables? Symbol-table manipulation? Closures? DESTROY? The Orcish maneuver? wantarray? goto? A custom import? no? A blessed coderef? A glob slot? A localized function? last to exit a calling subroutine? Walked the caller stack? Used a B:: module?

    If you don't use at least half of those in your project, should you be using Perl?

Re: To USE or not to USE
by ajdelore (Pilgrim) on Jul 22, 2003 at 16:17 UTC

    A related question. When I do this (as I often do):

    use CGI qw/header/; print header;

    Am I paying too high a price? Does importing selective functions save any cpu time, or does it just stop the namespace from getting filled?

    I suppose that I could just avoid the penalty of CGI and use:

    print "Content-type: text/html\n\n";

    Would this be a significant advantage for my scripts?

    On a related note, if someone could point me to a node on timing the execution of scripts, that would be great, as I can't seem to find the information I need to try this myself.

    </ajdelore>

      Importing selected functions reduces namespace pollution and memory usage, since aliases do take up space. Of course you could just use the OO interface instead.

      Note that your particular example is actually a counter-example, because CGI prints "Content-Type: text/html; charset=ISO-8859-1" in order to prevent cross-site scripting attacks. This demonstrates how the "just code this part yourself because it's too simple to load a module" attitude can lead to half-assed and broken code if you don't fully understand the problem you're trying to solve.

        I do make a habit of using CGI and other CPAN modules just for that very reason. I have no plans to change this and go re-writing CPAN every time I write code.

        I was merely asking about the performance issues from a perspective of academic curiosity. As others have pointed out, optimization is only one factor to consider along with readability, safety, development time, etc. when writing code.

        Was the last part of your post directed at me, personally? Or are you making a generalization?

        </ajdelore>

        Another advantage of being explicit in listing what functions you import is that your code is more self-documenting. You can see where each subroutine in your code comes from without having to know the default export lists of the modules you use.

      DProf is good for timing scripts, or just finding where they take their time. Try reading perldebug, it's mentioned there. Basically it's "perl -d:DProf <scriptname>" and then feed the resulting output to dprofpp. Be sure to also read the manpage for dprofpp; for example, to get elapsed real time rather than user+system time, use dprofpp -r.
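For a rough answer to the load-cost question without a profiler, one option is to time the require itself with Time::HiRes. This is only a sketch: load_time is a made-up helper, the two modules are arbitrary core examples, and a module that is already loaded will report close to zero.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: return the wall-clock seconds spent loading a module.
sub load_time {
    my ($module) = @_;

    # Turn Some::Module into the Some/Module.pm path that require expects.
    (my $file = "$module.pm") =~ s{::}{/}g;

    my $start = [gettimeofday];
    require $file;
    return tv_interval($start);
}

printf "%-12s %.4f s\n", $_, load_time($_) for qw(POSIX File::Find);
```

The figures vary from run to run and from machine to machine, so they are best read as an order of magnitude rather than a benchmark.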

      C.

Re: To USE or not to USE
by chaoticset (Chaplain) on Jul 29, 2003 at 14:57 UTC


    I can actually remember thinking almost exactly the same thing.

    I have never seen it be true in my work yet. Two large, obvious reasons for this are that I can't write significantly more efficient code, and that mine will typically be inferior in implementation anyway.

    Plenty of modules use tricks like autoloading to reduce overhead, because their author is smart enough to do that, and I'm not. I'm really, really not. (I'm not requesting an autoloader tutorial, either -- I know what it is, and have read how to use it, but it never occurs to me to use it, because I'm inexperienced. The module author usually isn't.)
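For readers who have not seen the trick, here is a minimal sketch of the idea, not the AutoLoader module itself. The package Lazy::Shapes and its two functions are invented for illustration. Each function body is kept as source text and compiled only the first time it is called.

```perl
package Lazy::Shapes;    # hypothetical package, for illustration only
use strict;
use warnings;

our $AUTOLOAD;

# Function bodies are stored as text so they are not compiled until needed.
my %SOURCE = (
    area_square => 'sub { my ($side) = @_; return $side ** 2 }',
    area_rect   => 'sub { my ($w, $h) = @_; return $w * $h }',
);

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;              # strip the package prefix
    return if $name eq 'DESTROY';   # never autoload destructors

    my $src = delete $SOURCE{$name}
        or die "Undefined subroutine $AUTOLOAD called";
    my $code = eval $src or die $@;

    # Install the compiled sub so later calls skip AUTOLOAD entirely.
    no strict 'refs';
    *{$AUTOLOAD} = $code;
    goto &$code;
}

package main;

print Lazy::Shapes::area_rect(3, 4), "\n";    # compiled on this first call
print Lazy::Shapes::area_square(5), "\n";
```

The saving only matters when the deferred bodies are large or numerous. For two one-line functions like these it is pure demonstration.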

    Besides that, my version will probably suck, in short.

    These two things coupled with the graceful ease of typing "use Foo;" make including the module and thinking well of the author the right choice most of the time. It's always the easy choice.

    -----------------------
    You are what you think.