glew has asked for the wisdom of the Perl Monks concerning the following question:

BRIEF:

Why do Perl and CPAN use so many different install directories?

install/lib, install/lib/perl5, and install/lib/site_perl

PERL version specific

etc.

DETAIL:

I install a lot of stuff from CPAN. Because I do not have root, I usually install it in ~/.cpan/install - i.e. I have passed ~/.cpan/install as --install_base to various tools.

I understand that I have to put stuff from the install path into PERL5LIB. I also understand that there are some machine-specific components, like *.so, and Perl-version-specific components.

But: why are there so many blamed directories where stuff gets put?

Here is a highly edited directory tree:

machine-name /users/glew/.cpan/ 52 : dtree install
install
" bin
" lib
" " 5.10.0 ... lots of modules
" " " x86_64-linux ... arch dependent stuff ... lots of modules
" " " " auto ... lots of modules
" " " " " B
" " " " " " Lint
" " " " B
" " " " " Lint ... Q: why is B::Lint in both .../lib/5.10.0/x86_64-linux/B::Lint, and in auto?
" " 5.8.5 ... lots of modules
" " " x86_64-linux ... lots of modules
" " " " auto ... lots of modules
" " perl5
" " " 5.8.3 ... lots of modules
" " " " x86_64-linux-thread-multi
" " " " " auto ... lots of modules
" " " File
" " " " 5.8.3
" " " x86_64-linux
" " site_perl ... lots of modules, but I'm tired of saying this
" " " 5.10.0
" " " " x86_64-linux
" " " " " auto
" " " 5.8.5
" " " " i686-linux-64int
" " " " x86_64-linux
" " " " " auto
" man
" " man1
" " man3
" share
" " man
" " " man1
" " " man3

The overall pattern seems to be:

a) there are 3 basic places where stuff gets put: install/lib, install/lib/perl5, and install/lib/site_perl

b) there are subtrees according to Perl version under each, e.g. install/lib/PERL-VERSION or install/lib/site_perl/5.10.0

c) there are arch specific subtrees under these, like x86_64. But the arch names are not always consistent, e.g. x86_64-linux and i686-linux-64int

c') there are auto directories under some of the above, mainly the arch specific stuff.

As you can see, I have many different versions of Perl installed. Or at least my employer does. (Reason: we always want to keep old scripts running, even though we may install new versions of Perl for new stuff.) Currently we have versions that include 5.6.1, several versions of 5.8.x, and 5.10.0, in 32-bit and 64-bit, threaded and non-threaded builds.

The default CPAN, Build.PL, etc. seem to create a lot of Perl version specific trees.

It looks like I have to install modules like File::Spec::Links multiple times - in subtrees for each of the Perl versions for scripts that will need to use them.

I understand why I need to do this if there are binaries or C libraries that need to be linked with. But why is this needed for straight scripts?

Believe it or not, but I actually am running out of diskspace. Plus the hassle of having to install multiple times.

When is it safe to put something like install/lib/5.8.5 on the PERL5LIB path for a perl 5.10.x script?

Re: Why do Perl and CPAN use so many different install directories?
by ikegami (Patriarch) on Jul 28, 2009 at 17:18 UTC

    lib is core.
    vendor_lib is vendor-installed.
    site_lib is user-installed.

    The arch subdirs are for binary libraries (.dll/.so), since they're not (necessarily) compatible with other versions of Perl. You don't need to specify these if you specify the parent using use lib or -I.
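A minimal sketch for checking this on any given perl: the core Config module records where that interpreter's core, vendor and site libraries live, and the lib pragma adds a matching arch subdirectory by itself when it finds one with an auto/ directory in it. The scratch directory and the helper name layout are made up for illustration.

```perl
use strict;
use warnings;
use Config;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Where this particular perl keeps its core, vendor and site libraries.
# The vendor entries may be unset, e.g. on a perl built from source.
sub layout {
    my %dirs;
    for my $key (qw(privlib archlib vendorlib vendorarch sitelib sitearch)) {
        my $value = $Config{$key};
        $dirs{$key} = (defined $value && length $value) ? $value : '(not set)';
    }
    return %dirs;
}

my %dirs = layout();
printf "%-10s %s\n", $_, $dirs{$_} for sort keys %dirs;
print "version $Config{version}, archname $Config{archname}\n";

# Hypothetical install base with an arch-specific subtree, in a scratch dir.
my $base     = tempdir(CLEANUP => 1);
my $arch_dir = "$base/$Config{archname}";
make_path("$arch_dir/auto");

# Run-time equivalent of "use lib $base". lib.pm also adds $base/<archname>
# because it finds an auto/ directory underneath it.
require lib;
lib->import($base);

my $found_arch = grep { $_ eq $arch_dir } @INC;
print "arch dir added automatically: ", ($found_arch ? "yes" : "no"), "\n";
```

The version and archname values printed here are the strings that show up as directory names in the tree in the question, which is why a 5.8.5 perl and a 5.10.0 perl end up with separate subtrees.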

Re: Why do Perl and CPAN use so many different install directories?
by Old_Gray_Bear (Bishop) on Jul 28, 2009 at 18:05 UTC
    "I understand why I need to do this if there are binaries or C libraries that need to be linked with. But why is this needed for straight scripts?"

    'Straight scripts' may use version dependent features, for example. A 'straight script' (I usually call them 'programs', by the way) that works correctly in Perl 5.10 may easily use features that are unknown in Perl 5.6.1.
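As a small illustration of that point: a pure-Perl file can still depend on the interpreter version, because it may use syntax that older perls cannot compile. The sketch below compares a perl version in the numeric form of $] against the release that introduced a feature. The table and the helper name perl_supports are made up for illustration and cover only three well-known cases.

```perl
use strict;
use warnings;

# Releases that introduced a few well-known features (numeric form, as in $]).
my %introduced_in = (
    'say'        => 5.010,    # use feature 'say'
    'defined-or' => 5.010,    # the // operator
    'our'        => 5.006,    # our declarations
);

# True if the given perl version (default: the running one) has the feature.
sub perl_supports {
    my ($feature, $perl_version) = @_;
    $perl_version = $] unless defined $perl_version;
    die "unknown feature '$feature'\n" unless exists $introduced_in{$feature};
    return $perl_version >= $introduced_in{$feature} ? 1 : 0;
}

# The versions mentioned in the question: 5.6.1, 5.8.5 and 5.10.0.
for my $v (5.006001, 5.008005, 5.010000) {
    printf "perl %.6f: say is %s\n", $v,
        perl_supports('say', $v) ? "available" : "not available";
}
```

A module that starts with a line such as "use 5.010;" makes the same check at compile time, so it fails early on an older perl instead of dying on a syntax error.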

    "Believe it or not, but I actually am running out of diskspace."

    Disk space is very cheap (I bought a terabyte for my home system early this year for US$110; the price is still falling). You need to explain to your Boss that keeping additional versions of the Perl Development Environment around requires additional resources, and that is why you put in for the additional hardware.

    "Plus the hassle of having to install multiple times."

    So why are you keeping multiple Perl environments around? "Reason: we always want to keep old scripts running...."

    This is called having a Regression Test Bed: the test-bed that is used to verify that your Latest and Greatest Change Fix didn't break anything. If you screw around with your test-bed, by getting rid of duplication and "saving space", that safety net goes out the window. If your test-bed does not look EXACTLY like the installation environment in the field, then you cannot draw a valid inference that your change didn't break anything else. You are guessing/hoping that there is nothing broken, but you can't be certain.

    "When is it safe to put something like install/lib/5.8.5 on the PERL5LIB path for a perl 5.10.x script? "

    It all depends. Do you want to guess about the correctness of your Perl 5.10.x program? Or do you want to be able to tell your Manager that you *know* that the program change hasn't broken anything?

    I suspect that a better approach here would be to segregate each different Perl version onto a separate physical machine. That way you get two things clear from the outset -- you can't clobber one version with another by accident; and you eliminate the temptation to 'fix' the installation by consolidating stuff to save space.

    These don't have to be big powerful machines, by the way. I once ran a QA/Regression lab on two dozen old Toshiba P-II and P-III laptops. They weren't the fastest things on the planet, but they could run the entire regression suite in just over 19 hours. The QA lab was the last stop in the trickle-down chain for hardware. (The Bosses get new laptops; the Tech Lead gets one of the bosses' old laptops; I get the TL's old laptop; the Lab gets my old machine. So the Lab gets a hardware refresh every 12-18 months.)

    Whatever you do, keep in mind the reason that you have multiple Perl installations. Breaking a Regression Test-Bed is not something you want to do without a long hard examination of the consequences, both for the company and for your own credibility.

    ----
    I Go Back to Sleep, Now.

    OGB

      The annoying problem that I am wrestling with now is that I cannot seem to persuade CPAN to install HTTP::Daemon in /users/glew/.cpan/install/lib/site_perl/5.10.0/HTTP/Daemon, or whatever directory it needs to be in for 5.10.0. Instead, even though I am using 5.10.0, and even though many CPAN modules have installed under the 5.10.0 directories, CPAN finds that there is an up to date version under /users/glew/.cpan/install/lib/site_perl/5.8.5/HTTP/Daemon.

      Therefore my question: should I put the 5.8.5 directories in the PERL5LIB path for a 5.10.0 script? Is there a way to tell CPAN to install versions under 5.10.0 directories, even though 5.8.5 exists and is up to date? Perhaps I should create completely separate ~/.cpan trees for the different versions?

      Unfortunately, separate machines is not an option. The only disks I have access to are Linux, NFS, shared amongst all machines.
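One way to see which copy of a module a particular perl actually picks up is to load it and look it up in %INC, which maps each loaded file to the path it came from. A minimal sketch: File::Spec stands in for HTTP::Daemon so that it runs on a bare perl, and the helper name loaded_from is made up for illustration.

```perl
use strict;
use warnings;

# Return the file a module was loaded from, or undef if it cannot be loaded.
sub loaded_from {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # File::Spec -> File/Spec.pm
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

for my $module (qw(File::Spec No::Such::Module::Here)) {
    my $path = loaded_from($module);
    print "$module: ", (defined $path ? $path : "not found in \@INC"), "\n";
}

# The search order itself: the first directory that has the file wins.
print "search path:\n";
print "  $_\n" for @INC;
```

Running this under each perl in turn, with the same PERL5LIB, would show whether the 5.10.0 interpreter is finding the copy under a 5.8.5 directory.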
        I zorched my ~/.cpan directory tree, and re-installed the modules I use. Now I see only the generic perl5 stuff, and perl5/*/5.10.0/* stuff.

        I conjecture that CPAN will install a new package in either generic perl5 directories, or in perl5/*/$VERSION directories, e.g. perl5/*/5.10.0. However, if it finds an existing directory for an older version of perl (not the package), and the package is unchanged, it will leave the package in the old directory, e.g. perl5/*/5.8.5.

        I conjecture that it is usually safe to layer the many, many directories, with the most recent version at the front of the path, and the oldest at the end. That way, if CPAN has installed a new version, it will be found before the old. However, you could get into trouble, in ways such as:

        (a) your code uses an obsolete module - e.g. you get perl5/*/5.8.5/*/Old/Module rather than perl5/*/5.10.0/*/New/Module. And you find the old module still on your path.

        (b) you have something like perl5/site_perl/*/5.8.5/Module that you have made local changes to. If you install perl5/5.8.5/Module, the newer standard version may override your local changes.

        I conjecture that it might be better to have different CPAN trees for every version of the compiler you deal with. E.g. .cpan/5.10.0/, .cpan/5.8.5/, ... But I see nobody recommending this. So maybe I shouldn't bother.
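Both ideas above can be sketched in a few lines, under stated assumptions. The first helper orders version-named directories newest first, using the core version module, because a plain string sort would put 5.10.0 before 5.8.3. The second builds a separate tree per interpreter from that perl's own Config values. The helper names, and the choice to include the archname in the per-perl path, are assumptions for illustration and not something CPAN does by itself.

```perl
use strict;
use warnings;
use Config;
use version;

# Order version-named directories newest first, e.g. for a layered PERL5LIB.
sub newest_first {
    my @versions = @_;
    return sort { version->parse($b) <=> version->parse($a) } @versions;
}

# Hypothetical per-interpreter tree: <root>/<perl version>/<archname>.
sub per_perl_base {
    my ($root) = @_;
    return join '/', $root, $Config{version}, $Config{archname};
}

my @layered = map { "perl5/site_perl/$_" } newest_first(qw(5.8.5 5.10.0 5.8.3));
print join(':', @layered), "\n";

# Each perl that runs this prints its own tree under the same root.
print per_perl_base('/users/glew/.cpan'), "\n";
```

Layering only covers pure-Perl modules. Compiled (.so) parts built for one version or architecture still need their own copy, which is an argument for the per-interpreter tree.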
Re: Why do Perl and CPAN use so many different install directories?
by locked_user sundialsvc4 (Abbot) on Jul 29, 2009 at 02:43 UTC

    One of the most-important considerations for any “real-world production shop” is to be able to continue running old stuff while gradually introducing new stuff. There is, in a word, “a gigantic gulf that is forever fixed” between “development” and “production!”

    Very, very quickly, the inherent complexity of the situation exceeds the capacity of anyone to manage it without explicit controls. (Hair follicles are a very precious thing, as you too will soon discover if you are a male of this species.) Disk space, on the other hand, is extremely cheap. This is why you see some of the things that you do, especially in the Perl world. Perl is “where the rubber hits the road” in practical production data-processing.