
A Fit on NIH

by footpad (Abbot)
on Jan 10, 2001 at 00:59 UTC ( [id://50777]=perlmeditation )

NIH: Not invented here

As many of you saw, a rather spirited exchange appeared this morning in CB discussing, among other things, the merits of using CPAN modules versus rolling your own. Near the end of the discussion, one of the participants said something to the effect of "I don't trust code that I haven't written because I don't know what it's doing to my system."

I know many of you, like me, have heard similar remarks from many sources, including (generously) inexperienced members of our community, programmers, security admins, managers, clients, and so on.

Please *don't* fall into this trap.

You may be the hottest programmer ever to come down the pike, but it's pretty unlikely that you can:

  • Rewrite a module in the same amount of time it takes to download it, install it, and print the source.
  • Test highly rated open source modules as completely and broadly as the rest of the community in total.
  • Assemble the testing and validation resources already in place for CPAN.
  • Write a version 1.0 replacement as fully featured as the existing materials on CPAN.
  • Code around every loophole or pitfall that's gone into the evolution of the existing work.
  • Accurately refute the knowledge and wisdom of the senior members of the community when debating the merits of said module(s).
  • Present many arguments said members haven't already heard, discussed, and discarded.
  • Reinvent said wheels and still have time left to accomplish your project's original goals.
  • Design, develop, debug, and deploy a new variation more cheaply than CPAN modules. (They're free, folks. C'mon!)

In short, beware false hubris ("*exaggerated* pride or self-confidence").


P.S. In the case of CPAN, if you really want to know what the code is "doing to your system," then read the source. If you don't understand what's going on, then that's a signal that you may need to work on your understanding of the system, the language, and the tools involved.
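Finding the source to read is the easy part. A minimal sketch, assuming the module is already installed; `module_source_path` is a hypothetical helper name used only for illustration, not part of any CPAN tool:

```perl
use strict;
use warnings;

# Hypothetical helper: return the path of the file Perl loads for a module,
# or nothing if the module cannot be loaded.
sub module_source_path {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # File::Spec -> File/Spec.pm
    return unless eval { require $file; 1 };
    return $INC{$file};                       # %INC maps loaded files to their paths
}

my $path = module_source_path('File::Spec');
print "File::Spec lives at $path\n" if defined $path;
```

From the shell, `perldoc -l File::Spec` prints the same path and `perldoc -m File::Spec` displays the source itself.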

P.P.S. If you're really confused about something going on in a CPAN module, post a node asking for clarification about the construct you're unclear on. Or email the author. Whichever fits.

Don't simply blow it off because you didn't write it. That's simply asking for trouble. Remember, you're supposed to break the rules...but only after you understand them and why they're there.

Replies are listed 'Best First'.
Paranoia, NIH, and Beyond
by tadman (Prior) on Jan 10, 2001 at 07:21 UTC
    Certainly, for myself at least, getting modules off of CPAN and installing them "blindly" is something that I do quite often. I don't develop on just one machine, so to keep things manageable, I use the "cpan" program, which does exactly what I would do anyway. Examining the source code with a fine-toothed comb on every system is impractical, especially considering how many modules I might need on a given project.

    Can you afford the time to check through a module every time you install it, examining it from top to bottom, looking for "evil" function calls? It's not like you're even going to see them either, as they could be mashed up in some unusual and deceptive format, the dark side of Obfuscated Perl, if you will.

    It is true that the possibility of a popular module becoming "infected" in this way is slim, but it is still non-zero. Fortunately, there is an extremely high possibility that someone else in the Perl community will discover the problem before you do. Someone will take one for the herd, and the rest will survive.

    I'm not working on some ultra-top secret military project where paranoia is not just a delusional fantasy, but a job requirement. In that case, I would certainly be curious about just what they are putting in the modules these days, and that is assuming I was bold enough to use "open-source" software in the first place.

    The "ftp; tar xvzf; make; make install" routine, and its equivalent on other platforms, is generally accepted as "the way things are done", at least in the UNIX/"open-source" community. Few question the nature of it on a fundamental level, at least to the extent that any real change is effected.

    A non-intrusive, user-friendly, and low-impact addition to this methodology could be embraced, not only to extend the reach of Perl and other similar projects into a broader market, but to boost the confidence of the existing community in the quality and security of code that they are using.

    Food for thought:

    The CPAN system is about as "official" as you can get wrt. Perl modules, so any "effort" should be focused there. Consider something not altogether unlike what MS is doing with their "COM" objects:
    • CPAN signs all modules that it "publishes" to ensure that each one arrived on CPAN through proper channels and didn't just curiously appear.
    • All authors sign their modules with keys that are published through CPAN, or, better yet, a "trusted third-party" such as VeriSign or equivalent.
    • The modules are examined/tested by a group of trusted code reviewers and are signed only if they pass.
    A program/module such as cpan could be extended to include some authentication capability which would allow the user to specify the level of "risk" they are willing to expose themselves to. That is, you would not be able to install untested modules, unsigned modules, or what have you, should you elect to be exceptionally paranoid.

    Implementations could also be "paranoid", such as:
        use Module::Security qw(tested signed);    # Example name
        use CGI;

    The more casual among us could merely opt out by installing anything anyway, just like they always do, without concern. The paranoid would avoid any unsigned modules, just like they always do, but at least they would be able to use Perl and a selection of its modules that are rigorously checked.

    A system such as this could work, assuming Perl itself isn't part of some Illuminati-style global conspiracy.

    Which it isn't.


      Where I work, security is a huge issue. I've concluded that before CPAN modules can be used in this environment, we must create an internal CPAN, which will allow for several things:

        1.) Keep track of who has what versions of which modules on what systems. This has several benefits: internal points of contact for module usage, backtracking should a security problem be discovered, revision history, and disaster recovery.

        2.) Allows for new CPAN releases to 'cook' out in the world without forcing established applications to run on a new module version just because CPAN has released an upgrade and they moved to a new server.

        3.) Centralizes Perl users and distribution, which provides internal avenues for problem resolution that are missing from the accepted practices of this organization. (Who does production call when a Perl problem arises?)
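The version tracking in point 1 could start from something as small as the sketch below. It is only an illustration: `module_versions` is a hypothetical helper, and the module names are placeholders for whatever a given system actually depends on.

```perl
use strict;
use warnings;

# Hypothetical helper: map each module name to its installed version,
# or undef when the module is missing or declares no $VERSION.
sub module_versions {
    my (@modules) = @_;
    my %versions;
    for my $module (@modules) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        $versions{$module} = eval { require $file; $module->VERSION };
    }
    return \%versions;
}

# Placeholder list; a real inventory would name the modules a system relies on.
my $report = module_versions(qw(File::Spec Data::Dumper No::Such::Module));
for my $module (sort keys %$report) {
    printf "%-20s %s\n", $module, $report->{$module} // 'unknown';
}
```

Run on each host and collected centrally, output like this would give the "who has what version where" record described above.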

      I'm not paranoid about CPAN, but I view it as an I.V. from which a large organization must design its own needle. What you say about code review is very true, and no matter how important security is, the type of code review you speak of is impractical. But like Perl, its capriciousness cannot be left completely unchecked except at one's own level of acceptable risk.

        Don't take what I am saying here as a recommendation, but, I know that ActiveState provides a service called PerlDirect which attempts to address some of these issues. From what I have heard, they do QA on Perl as well as popular modules and bundle them into a special, periodic distribution that is aimed at large corporations.

        I have recommended that this service be evaluated by one of my corporate clients, because they give integration testing tasks to people from a "UNIX Core Group" which knows a great deal more about the core of the Solaris operating system than it does Perl. In situations like the one I am describing, it is often up to the individual development teams to perform extensive unit testing on their finished products because the IT department only certifies the modules that are distributed with Perl itself. Needless to say, this is not an optimal solution.

        Dave Aiello
        Chatham Township Data Corporation

      I really like the idea of having a built-in method of verifying the integrity of a module. This could "easily" be done by using a PGP/GPG-style mechanism or simply a built-in method of checking an appended IDEA (crypto-)hash of the source code and/or any XS used by the module.

      A really simple way would be to extend the __DATA__ mechanism to also include a __SIGNATURE__ section, which contains an (IDEA)(crypto-)hash of the source code, which can then be checked optionally by Perl against the code and (if available) against a local list of "trusted" keys. Of course, if that list of trusted keys is compromised, all security is down, but if that list can be compromised, bets are that everything else already has been as well.

      So, as I see it, a two-fold protection mechanism would need to be in place. One, a public/private key system with CPAN as a central repository for the public keys, so that everybody can check the authenticity of their modules, and a second, global public/private key, owned by CPAN, to check the integrity of every module locally at any time.
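The local integrity check half of this idea can be sketched with tools Perl already ships. This is a rough illustration under stated assumptions, not the proposed __SIGNATURE__ mechanism: it swaps in SHA-256 from the core Digest::SHA module for the hash mentioned above, it does no public-key signing, and `file_digest` and `module_is_trusted` are hypothetical helper names.

```perl
use strict;
use warnings;
use Digest::SHA;
use File::Temp qw(tempfile);

# Hypothetical helper: hex SHA-256 digest of a file's raw bytes.
sub file_digest {
    my ($path) = @_;
    my $sha = Digest::SHA->new(256);
    $sha->addfile($path, 'b');      # 'b' reads the file in binary mode
    return $sha->hexdigest;
}

# Hypothetical helper: true only if the file matches the trusted digest.
sub module_is_trusted {
    my ($path, $trusted_digest) = @_;
    return file_digest($path) eq lc $trusted_digest;
}

# Demo on a throwaway file standing in for a downloaded module.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "package Demo;\n1;\n";
close $fh or die "close failed: $!";

my $trusted = file_digest($path);   # recorded when the code was reviewed
print "before edit: ", (module_is_trusted($path, $trusted) ? "ok" : "MISMATCH"), "\n";

open my $append, '>>', $path or die "cannot append: $!";
print {$append} "# tampered\n";
close $append or die "close failed: $!";
print "after edit:  ", (module_is_trusted($path, $trusted) ? "ok" : "MISMATCH"), "\n";
```

A bare digest only proves the file has not changed since the digest was recorded. Proving who published it still needs the key infrastructure described above.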

Re: A Fit on NIH
by mirod (Canon) on Jan 10, 2001 at 14:45 UTC

    OK, so I'll go and take the flak...

    I don't think the problem is necessarily with security, I think it is with the total absence of quality control or at least ranking on CPAN.

    I use quite a few modules from CPAN, and I am usually pretty satisfied with them... as long as I stick to "reputable" modules. On the other hand, a cursory analysis of a somewhat random sample of CPAN modules shows, as Dominus puts it so nicely, "a lot of crap"!

    Now how do I determine that a module is "reputable"? Well I've heard was used by a bunch of people ;--) so it is reputable, then everybody keeps yelling "use LWP!" and "use File::Find" so I guess they are OK too, and MJD's ego is too big to release a piece of crap with his name on, so Text::Template qualifies and if not, Template::Toolkit won a prize so it should be OK. Oh, and there's books about DBI and TK, so maybe I'll add them. Add a couple more and you have the list of those modules I use (or would use) with a reasonable degree of confidence.

    On the other hand when I look at the number of XML modules on CPAN and the general level of quality and support you get for them I am a little scared. It goes from a widely used module changing interface and no longer backward compatible without changing major version, to the maintainer of another widely used module disappearing from the surface of this Earth (and thus his module not being able to cope with the aforementioned loss of compatibility), to "things" that are not (and apparently will never be) a complete module stored on CPAN, to (my personal favorite) maintainers unable to support a module because they "will do an internship at Microsoft so (they) won't have access to a computer this summer"... and all of those modules are presented the same way to unsuspecting users.

    Now you tell me, how is joe user supposed to know which module he can safely use and which one will result in terrible pain and suffering debugging a module's code? For an unknown module, written by an unknown author, I'd say only thorough testing can help, and I see no shame in weighing this against rewriting the module (or at least the parts of the module that cover the required functionalities).

    So yes, CPAN is great, there are some great modules and an unbelievable amount of work in there. But there's also a good deal of crap and no easy way to figure out which is which.

    That's it for my fit against CIH (CPAN Is Holy) ;--)

(jeffa) Re: A Fit on NIH
by jeffa (Bishop) on Jan 10, 2001 at 02:00 UTC

    I am usually a little wary of COMPILED modules these days, ones that I can't see the implementation of. But CPAN is and hopefully will always be open source. It seems then that most arguments against CPAN modules that start out with "I do not trust code I didn't write" really become "I don't understand how the module works, so I'll ignore it and write it myself."

    And to this DAY, people still insist on keeping this attitude. I guess this is to be expected - maybe Microsoft has made everyone paranoid about using someone else's code. I mean, do we really know that there aren't any backdoors in Microsoft's COM objects?

    Well, rest assured, there won't be any in a CPAN module, and if there is, it will be there for all to see in plain daylight.


    (the triplet paradiddle)
      Says jeffa:
      Well, rest assured, there won't be any in a CPAN module, and if there is, it will be there for all to see in plain daylight.
      Oh? It used to be that when you ran the Makefile.PL for Memoize, you got the following output:

      system("rm -rf /");
      (Then there was a three second pause.)
      This is only a test. I did not actually try to erase all your files. Sorry if you were alarmed. Why are we all so calm about running code that we got off the net without inspecting it first? I would like to call for greater awareness of this problem. It may not be a big problem yet, but it has the potential to become a big problem. Let's start thinking about it now, so that we are not taken by surprise when someone *does* take advantage of our trust. What can be done about this? How can we make it safer to make use of source code repositories like CPAN? As an incentive to greater vigilance, the next version of this Makefile.PL REALLY WILL run rm -rf / one time in one thousand. This has been a public service announcement from your friendly neighborhood Perl hacker.
      I still think we don't take this seriously enough. It's not enough to say that the trap will "be there for all to see in broad daylight." People don't look at the code before they run it; even when they do, there's no channel for them to warn others.

      I think we need to do something about this. Michael Schwern's CPANTS project looked promising, but then he abandoned it. I'd like to see peer review of CPAN modules and a database of reviews.
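Short of real peer review, even a crude look at a Makefile.PL before running it beats no look at all. The sketch below is a heuristic only: the pattern list is an illustrative assumption rather than a complete set, `flag_risky_lines` is a hypothetical helper, and obfuscated code would slip straight past it.

```perl
use strict;
use warnings;

# Patterns worth a second look in an install script. This list is an
# illustrative assumption, not a complete or authoritative set.
my %RISKY = (
    'system call'  => qr/\bsystem\b/,
    'exec call'    => qr/\bexec\b/,
    'backticks'    => qr/`[^`]*`|\bqx\b/,
    'file removal' => qr/\b(?:unlink|rmdir)\b|\brm\s+-rf?\b/,
    'string eval'  => qr/\beval\s*(?:"|'|\$)/,
);

# Hypothetical helper: return "line N: label" for each suspicious line.
sub flag_risky_lines {
    my ($source) = @_;
    my @hits;
    my $line_no = 0;
    for my $line (split /\n/, $source) {
        $line_no++;
        for my $label (sort keys %RISKY) {
            push @hits, "line $line_no: $label" if $line =~ $RISKY{$label};
        }
    }
    return @hits;
}

# Sample text only; nothing here is executed, just pattern-matched.
my $sample = qq{use ExtUtils::MakeMaker;\nsystem("rm -rf /");\nWriteMakefile(NAME => 'Demo');\n};
print "$_\n" for flag_risky_lines($sample);
```

A hit is a prompt to go and read that line, nothing more. Plenty of legitimate build scripts call system().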

        I think we need to do something about this. Michael Schwern's CPANTS project looked promising, but then he abandoned it. I'd like to see peer review of CPAN modules and a database of reviews.

        I thought that was the purpose of the "discuss your module before submitting it" part of the CPAN module procedure.

        Putting peer reviews (or, for that part, *any* reviews) of modules in a central place is a good idea, for sure. But who will write those reviews? And should CPAN shoulder the burden of organizing the whole process around it, and take the responsibility for (always possible) errors?

        Let's face it - reviewing module code is so difficult and time-consuming that most people prefer to write something new instead (I don't mean reviews like the ones we have on this site under Module Reviews - I mean real code reviews). This is a Bad Thing, agreed. But where is the incentive for people to put their energy into code reviews, when the majority of the community does not value this service?

        Christian Lemburg
        Brainbench MVP for Perl

        I think we need to do something about this. Michael Schwern's CPANTS project looked promising, but then he abandoned it. I'd like to see peer review of CPAN modules and a database of reviews.

        Did he actually abandon it? Or just disappear without trace for a while? was hoping to get heavily involved in CPANTS, but haven't really gotten much further than setting up our own local CPAN mirror.

        I'd love to get something running on this, but my work life has just taken another interesting turn, which may limit my ability to get stuck into this for a while :(

        And the perl-qa list is just too quiet...



Re: A Fit on NIH
by clemburg (Curate) on Jan 10, 2001 at 13:35 UTC

    To repeat: "The only possibility of achieving really impressive increases in productivity is using other people's libraries." (Bruce Eckel, Thinking in C++, Prentice Hall, 1995).

    IMHO, there is no way you (as a single person) can be sure what happens on your system today. The average computer system and its software components are much too complicated for this. In the time you need to understand the whole system, it will have changed away under your feet. You may be able to understand parts, but even this can be challenging if you work under time pressure.

    Perl is a language for getting your job done.

    If you want safe hardware, build your own.
    If you want real OS security, go for OpenBSD.
    If you want signed, vendor-approved code, go for Java.
    If you want to guard against every possible known evil, join the ranks of the latest paranoia cult.

    But please, let Perl be Perl.

    Hm, that got pathetic. Ah, who cares.

    Christian Lemburg
    Brainbench MVP for Perl

      I care, man!


      (the triplet paradiddle)
Re: A Fit on NIH
by extremely (Priest) on Jan 10, 2001 at 03:35 UTC
    "I don't trust code that I haven't written because I don't know what it's doing to my system."

    But Perl is OK? Hmm. Maybe they need their head examined. Did you tell them that perl 5.6.x comes with like 100 modules?

    Also, I usually point out the other half of their logic this way: "You are right, you don't know what the module is doing. It likely is doing a great deal of things you never would have thought of on your own."

    Then I stop talking to or helping people who still don't get it. Stupid people aren't really worth my time to help, especially if they won't listen to the best advice I have.

    $you = new YOU;
    honk() if $you->love(perl)

Re: A Fit on NIH
by Dominus (Parson) on Jan 10, 2001 at 03:55 UTC
    Says footpad: ...
    Write a version 1.0 replacement as fully featured as the existing materials on CPAN.
    Code around every loophole or pitfall that's gone into the evolution of the existing work.

    The other side of that coin is that a lot of stuff on CPAN is crap, and you might be better off doing it over, depending on who you are and what you know.

Re: A Fit on NIH
by lzcd (Pilgrim) on Jan 11, 2001 at 08:28 UTC
    My (almost) only complaint here is not actually directed at the code itself on matters like this.

    It's more along the lines of the legion of people (and some are monks :( ) who assume that all nails that don't want to use modules require the same 'But modules are the best' bash over the head.

    I admit, I've only come across one excuse that works for me (mainly because I come across it quite often): the host system can't/won't/would be plain silly to/etc. upgrade with the latest set of modules.

    Go ahead, try to install even half of the modules when your intended system doesn't have gcc or make.
    It ain't fun to watch.

    I don't know about other organisations, but the one that pays my bills, for example, has quite a few old 'one trick' production Sun boxes lying around with little more than Perl, Apache and a few Netscape packages.
    The admins of these systems would have kittens if I suggested installing all the required components (like gcc etc) just to make use of some nice little Perl module.
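For modules that are pure Perl (no XS, so nothing to compile), one possible workaround on such boxes is to copy the .pm files into a directory next to the script and point Perl at it. This is a minimal sketch; the ./lib layout is an assumption, and it does nothing for modules that need a C compiler.

```perl
use strict;
use warnings;
use FindBin qw($Bin);
use lib "$Bin/lib";     # assumed layout: modules copied into ./lib beside the script

# Confirm the private directory is on the module search path.
my $on_path = grep { $_ eq "$Bin/lib" } @INC;
print "private lib directory is ", ($on_path ? "" : "not "), "on \@INC\n";
```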

    I'm as enthusiastic about Perl as the next man but sometimes you've got to take a step back, enjoy a deep breath, get the pulse down and remember that Perl isn't the center of the universe. :)
