fiddler42 has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to install a customer's Perl environment on my employer's network. When said customer runs our non-Perl applications, a lot of Perl code gets exercised here and there throughout their (chip-building) flow. During the archive process at the customer site, a significant amount of Perl content (scripts, modules, etc.) gets collected. By the time everything is unpacked on my employer's network, there is a lot to sort out; for example, the PERLLIB environment variable captured at the customer site has over 430 directory entries.

Not surprisingly, my question is more system administrator-related: what would be the best method to verify that the captured Perl environment at the customer site unpacks and runs smoothly on my employer's network? I realize this is a loaded question, but, in essence, making sure the 430+ entries in the PERLLIB environment variable are ordered just right is proving difficult.

After unpacking my customer's environment, I print out the contents of @INC when a Perl script tries to run but fails to find a specific module (this is BY FAR the most common problem). I then grep/find where the module is located in the extracted directory structure and bump it to the beginning of the @INC list. It is not uncommon to see 'use lib' entries in a .pm that reference something in the customer's local directory tree, so I am sure you can imagine how fun it is to address such issues.
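
For illustration, the ad-hoc check I end up doing looks roughly like this (the module name and tree root are placeholders, not real paths from the flow):

    use strict;
    use warnings;

    # Placeholder: whichever module the failing script could not load.
    my $missing = 'Some::Customer::Module';

    # Show the search path the failing script actually saw.
    print "Current \@INC:\n";
    print "  $_\n" for @INC;

    # Hunt for candidate copies of the module under the unpacked tree.
    (my $rel = $missing) =~ s{::}{/}g;
    my @hits = `find /path/to/unpacked/tree -path "*/$rel.pm" 2>/dev/null`;
    print "Candidate locations:\n", @hits;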

I apologize if this is not the best platform to post such a question, but I wasn't sure where else to go for something this complex. (Actually, I posted this Q on Stack Overflow and the first response was to post here. :-)


Re: How to manage the transfer of large Perl environments from one network to another
by hippo (Archbishop) on Mar 29, 2019 at 16:49 UTC
    the PERLLIB environment variable captured at the customer site has over 430 directory entries.

    I can't be the only one who read that (twice to make sure I hadn't mis-read) and immediately thought how impressively horrendous it is. You have my sympathy.

    what would be the best method to verify that the captured Perl environment at the customer site unpacks and runs smoothly on my employer's network?

    Obvious answer: run the test suite. Now I'm going to assume that an organisation that has ended up with 430 entries in $PERLLIB probably doesn't have a test suite. So your first task is to write one. I would start by writing a test which just compiles everything in turn. That way you can run the test, keep adding to @INC or whatever needs to be done to get things to compile without breaking the ones you've already fixed. This will be a lot of work but it's sure to pay off in the end.
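
    Something along these lines as a starting point (untested, and the tree root is obviously a placeholder):

        # compile.t -- sketch: check that every script and module at least compiles.
        use strict;
        use warnings;
        use Test::More;
        use File::Find;

        my $root = '/path/to/unpacked/tree';   # placeholder

        my @files;
        find(sub { push @files, $File::Find::name if /\.p[lm]$/ }, $root);

        for my $file (@files) {
            # perl -c compiles without running, so nothing multi-day gets kicked off.
            my $rc = system $^X, '-c', $file;
            is($rc, 0, "$file compiles");
        }

        done_testing();

    Run it with prove after every adjustment to PERL5LIB so you can see at once whether an earlier fix has been broken.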

    Actually, I posted this Q on Stack Overflow and the first response was to post here.

    That's fine but if you could link to that SOQ then it will save possible duplication of effort in replying. Thanks.

      Yes, regarding PERLLIB, you read it right. To be exact, there are 439 entries, which collectively radiate cold shafts of broken glass in every direction imaginable. :-/

      Could you kindly shed some light on what you mean by a test suite? I am asking because, typically, Perl scripts are run to initialize a chip-building environment by parsing configuration files and settings, etc., and then a non-Perl compiled binary is fired up for what can sometimes be a multi-day run. Now, I could skip the non-Perl compiled binary activity for testing purposes. The various Perl scripts, though, are run from wildly different locations. Are you saying a master/test Perl script should be used to appropriately augment @INC as each chip-building Perl script fails? In other words, at the start of test suite activity, empty @INC and build it out to include only what is necessary.

      Pardon my ignorance, but I am not familiar with setting up test suites in Perl. Typical Perl scripts that I author, for example, require only a handful of custom modules, so this is turning out to be quite the learning experience for me...

        Could you kindly shed some light on what you mean by a test suite?

        As luck would have it I am currently trying to produce a gentle introduction to testing in Perl. You can read the work in progress on my scratchpad. (Update: This document has now been published as Basic Testing Tutorial). In essence a test suite is a set of code which verifies that some other code behaves as expected. Ordinarily this is little more than a sequence of subroutine or method calls with associated test data, but in your case it can be as simple as trying to compile a catalogue of scripts. Your Perl scripts will either compile or not, but you can (presumably) attempt to compile them without running them and thereby do no damage.
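
        If it helps to see the shape of it, here is a tiny self-contained example using a core module (nothing to do with your chip flow, purely illustrative):

            use strict;
            use warnings;
            use Test::More tests => 3;
            use List::Util qw(sum);

            # Call a routine with known data and check the answers.
            is( sum(1, 2, 3), 6,  'sum of a small list' );
            is( sum(-5, 5),   0,  'positives and negatives cancel' );
            is( sum(42),      42, 'single element' );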

        Are you saying a master/test Perl script should be used to appropriately augment @INC as each chip-building Perl script fails?

        Not exactly. The test suite is there just to test and should be idempotent. You could, if you so wish, construct another script to perform the augmentation and then subsequently use the test script to assess its effectiveness. Test scripts (and suites) exist for a number of reasons, but regression testing is not the least of them. If you adjust $PERLLIB to fix one script you must ensure that doing so does not break another. This is one of many areas where automated testing proves its worth.

      I think one very important question to ask first is: what do you want the system to look like after this clean-up is completed? Or is the task merely to document the current disorder?

      Chris
Re: How to manage the transfer of large Perl environments from one network to another
by swl (Prior) on Mar 29, 2019 at 22:37 UTC

    If you have the entire environment then you could try a reductionist approach.

    Find every lib dir that contains a .pm file and add them to $ENV{PERL5LIB}. That will hopefully make everything available so the process runs to completion. Then add an END{} block to the master script to print or dump @INC (and maybe also %INC) to a file. That will give you the necessary dirs to use.
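
    A rough sketch of the collection step (the tree root is a placeholder):

        # Emit a PERL5LIB value containing every dir under the tree that holds a .pm file.
        use strict;
        use warnings;
        use File::Find;

        my $root = '/path/to/unpacked/tree';   # placeholder
        my %libdirs;
        find(sub { $libdirs{$File::Find::dir} = 1 if /\.pm$/ }, $root);
        print join(':', sort keys %libdirs), "\n";

    And the reporting end, dropped into the master script:

        END {
            if (open my $fh, '>', '/tmp/inc-dump.txt') {    # placeholder path
                print {$fh} "\@INC:\n", map { "  $_\n" } @INC;
                print {$fh} "%INC:\n",  map { "  $_ => $INC{$_}\n" } sort keys %INC;
                close $fh;
            }
        }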

    Some caveats:

    If the process runs then you might not care that you have too many entries in @INC, and thus there is no need to reduce the set.

    This does not account for the order of libs, so might load incorrect versions and lead to subtle bugs. A check for .pm file uniqueness would be of benefit here.

    It doesn't fix the underlying problem of using so many dirs in PERL5LIB in the first place, but that sounds like a second job.

    If the full run takes days then it might not be worth the wait...

    The END block will not list all the used dirs if some scripts are called using system, backticks and so forth.

    An alternative given the last caveat might be to instrument the lib package to record each addition to @INC in a log file somewhere. Others can hopefully advise as to whether this is a reasonable idea and how it could be done.
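
    One way it might be done, purely as an untested sketch (the package name and log path are invented): put something like this in a module and pull it in ahead of everything else with PERL5OPT=-Mliblog.

        package liblog;    # hypothetical helper
        use strict;
        use warnings;
        use lib ();        # load lib.pm without importing, so we can wrap it

        my $orig_import = \&lib::import;
        {
            no warnings 'redefine';
            *lib::import = sub {
                my ($class, @dirs) = @_;
                # Append each 'use lib ...' the run performs to a log file.
                if (open my $fh, '>>', '/tmp/lib-additions.log') {
                    print {$fh} "$0: @dirs\n";
                    close $fh;
                }
                $orig_import->($class, @dirs);
            };
        }

        1;

    The module itself has to live somewhere already on PERL5LIB, of course, and because PERL5OPT is an environment variable, child perl processes started via system or backticks should pick it up too.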

      Another option is to use Module::ScanDeps::scan_deps (see docs at https://metacpan.org/pod/Module::ScanDeps#scan_deps).

      If I remember correctly, this will handle use lib calls so if you pass it all the .pl files in your copy of the client's environment then it should find all the dependencies and their locations without needing to execute the code.

      You will need to parse the results, but that's perhaps a small price to pay.

      It also will not handle mylib or rlib calls if they are used.
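
      A minimal sketch of driving it over the copied environment (the tree root is a placeholder, and there are other ways to slice the returned hash):

          use strict;
          use warnings;
          use Module::ScanDeps qw(scan_deps);
          use File::Find;

          # Gather the client's scripts (placeholder root).
          my @scripts;
          find(sub { push @scripts, $File::Find::name if /\.pl$/ }, '/path/to/unpacked/tree');

          # Static scan: follows use/require (and, as noted above, use lib) without executing anything.
          my $deps = scan_deps(files => \@scripts, recurse => 1);

          # Keys are module paths such as 'Foo/Bar.pm'; the 'file' slot says where each was found.
          for my $key (sort keys %$deps) {
              printf "%-40s %s\n", $key, $deps->{$key}{file};
          }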

Re: How to manage the transfer of large Perl environments from one network to another
by Your Mother (Archbishop) on Mar 30, 2019 at 00:40 UTC

    Of possible interest: Carton.
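
    For anyone who hasn't met it: Carton pins a project's dependencies from a cpanfile. A tiny hypothetical example (module names invented for illustration):

        # cpanfile -- the dependencies Carton should install and pin
        requires 'Moose', '2.2011';
        requires 'DBI';
        requires 'JSON::XS', '>= 3.0';

        # Then, in the project directory:
        #   carton install                    # installs into ./local, writes cpanfile.snapshot
        #   carton exec -- perl script.pl     # runs with the pinned libs on @INC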

Re: How to manage the transfer of large Perl environments from one network to another
by karlgoethebier (Abbot) on Apr 01, 2019 at 16:36 UTC
    "...more system administrator-related..."

    Ugly scenario. It might be easier to clone the box with rsync, give it a new IP address and hostname, and forget about all the cruel details?

    Some hints for the future: in the last company I was with, we had a default setup for SLES that took about 15 minutes to install in a virtual machine. We had at least four machines for each customer: customer_dev, customer_test, customer_prod_01 and customer_prod_02. All the Perl stuff was installed with perlbrew. Development, testing and deployment then worked like a charm. We had some hundred machines with a setup like this.

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'