Aldebaran has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, as I continue my headlong assault at friarhood, I'm trying to write up a meditation about Path::Tiny, which I'm using both to rewrite my previous scripts and to write new code. The task before me is to bundle up my template and get it online without publishing any sensitive data. I have no perfect divorce of content and data, but I have confined the stuff I don't want all over the net to one hash in config2.pm. The first thing I intend to do is to make a copy of config2.pm in a temporary directory without the values in the hash.

Can you "use" a lexical variable? What follows are the driver script, its output, and then utils2.pm, where redact() is to reside.

#!/usr/bin/perl -w
use strict;
use 5.010;
use lib "template_stuff";
use utils1;
use utils2;
use utf8;
use open qw/:std :utf8/;
use Path::Tiny;

my $current = Path::Tiny->cwd;
say "current is $current";
my $return = redact("config2");
__END__
current is /home/bob/1.scripts/pages/1.qy
temp dir is /tmp/backup_DEtSio
scratch_file is /tmp/backup_DEtSio/batch_01/1.manifest.txt
parent is /tmp/backup_DEtSio/batch_01
/tmp/backup_DEtSio
$VAR1 = {
  'my_sftp' => {
    'username' => '',
    'password' => '',
    'domain' => ''
  },
  'my_github' => {
    'password' => '',
    'email' => ''
  }
};

package utils2;
require Exporter;
use utils1;

our @ISA = qw(Exporter);
our @EXPORT = qw( redact);

sub redact{
  use strict;
  use warnings;
  use 5.010;
  use Path::Tiny;

  my $file = shift;
  #use $file; resulted in error
  use config2;
  my $sub_hash = "my_github";
  my $email = $config{$sub_hash}->{'email'};
  my $password = $config{$sub_hash}->{'password'};

  my $tempdir = Path::Tiny->tempdir('backup_XXXXXX');
  say "temp dir is $tempdir";
  my $scratch_file = $tempdir->child('batch_01', '1.manifest.txt')->touchpath;
  say "scratch_file is $scratch_file";
  my $parent = $scratch_file->parent;
  say "parent is $parent";
  chdir $tempdir unless $tempdir->subsumes(Path::Tiny->cwd);
  system("pwd");
  use Data::Dumper;
  print Dumper \%config;
  my $b = "dummy";
  return $b;
}
1;

Obviously, I edited out the values with the delete key, and I'd like them to simply show "redacted". There are some bad ideas chasing some good ones here. Eventually, I'd like to have an entire directory tree populated in the temp dir and then compressed by some means into a single file. If I'm going to port to Windows, what is a good choice for compression software? Thanks for your comment.

Replies are listed 'Best First'.
Re: redacting from config hash
by haukex (Archbishop) on Jul 31, 2018 at 05:52 UTC
    The first thing I intend to do is to make a copy of config2.pm in a temporary directory without the values in the hash.

    I would actually suggest going the other way around - removing sensitive data is more brittle than not including it in the first place. One common approach is to package a template configuration file with dummy values as placeholders along with everything else, and then have the actual configuration file reside only locally, for example in the user's home directory, or on *NIXish machines in /etc. That way, when you bundle up your application, you won't accidentally package up your sensitive data as well. Or at the very least, keep a config2.pm.templ template file for packaging, and make sure not to package up your actual config2.pm.

    Can you "use" a lexical variable?

    If you mean accessing a my variable in config2.pm that's loaded via use config2;, then no, at least not without a little bit of trickery, which is why package variables (our) are typically used for that kind of thing.
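
    A minimal sketch of that difference, assuming a hypothetical package Demo::Config declared inline in place of a separate config2.pm (a file loaded with use has its own lexical scope, so the effect on a my variable should be the same). The our variable can be reached from outside by its fully qualified name; the my variable cannot.

```perl
use strict;
use warnings;

{
    package Demo::Config;    # hypothetical stand-in for config2.pm

    my  %hidden = ( password => 'lexical' );   # visible only inside this block
    our %config = ( password => 'package' );   # lives in the symbol table
}

# The package variable can be reached by its fully qualified name.
my $visible = $Demo::Config::config{password};

# The same spelling for the lexical finds only an empty, unrelated
# package variable; the my variable itself cannot be named from here.
my $lexical_count = do {
    no warnings 'once';
    scalar keys %Demo::Config::hidden;
};

print "our: $visible\n";
print "my:  $lexical_count keys visible\n";
```

    Exporting %config through Exporter, as config2.pm does, only works because the hash is a package variable.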

    what is a good choice for compression software?

    I'd suggest ZIP; it's a common cross-platform format. As for support in Perl, there are the core modules IO::Compress::Zip and IO::Uncompress::Unzip, and the CPAN module Archive::Zip.
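
    As a rough illustration of the two core modules, here is a sketch that compresses a string into an in-memory ZIP archive and reads it back. The member name config.ini and the sample content are made up for the example.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw(zip $ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Illustrative content only; nothing here comes from a real config file.
my $original = "username = redacted\npassword = redacted\n";

# Compress the string into an in-memory ZIP archive with a single member.
my $archive;
zip \$original => \$archive, Name => 'config.ini'
    or die "zip failed: $ZipError";

# Read the first member back out to confirm the round trip.
my $restored;
unzip \$archive => \$restored
    or die "unzip failed: $UnzipError";

print $restored eq $original ? "round trip ok\n" : "mismatch\n";
```

    Passing file names instead of scalar references makes the same calls work on files. For packing a whole directory tree, Archive::Zip is probably the more convenient of the three.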

    Having said all that, here's one way to produce a template file for the example data you showed. I'm using do to load it, which means that you must trust the contents of this file as it will be executed.

    Input (config2.pm):

    $CONFIG = {
      my_github => {
        email => "me\@example.com",
        password => "PaSsWoRd"
      },
      my_sftp => {
        domain => "example.com",
        password => "PaSsWoRd",
        username => "sftpuser"
      }
    };

    Code:

    use warnings;
    use strict;
    use Data::Dumper;

    my $file = '/path/to/config2.pm';
    my $templ = '/path/to/config2.pm.templ';

    our $CONFIG;
    if (not my $return = do $file) {
        die "couldn't parse $file: $@" if $@;
        die "couldn't do $file: $!" unless defined $return;
        die "couldn't run $file" unless $return;
    }

    for my $hash (values %$CONFIG) {
        for my $val (values %$hash) {
            $val = 'redacted';
        }
    }

    open my $fh, '>', $templ or die "$templ: $!";
    print {$fh} Data::Dumper->new([$CONFIG],['$CONFIG'])
        ->Useqq(1)->Sortkeys(1)->Quotekeys(0)->Dump;
    close $fh;

    Output (config2.pm.templ):

    $CONFIG = {
      my_github => {
        email => "redacted",
        password => "redacted"
      },
      my_sftp => {
        domain => "redacted",
        password => "redacted",
        username => "redacted"
      }
    };

      email => "me\@example.com", password => "PaSsWoRd"
      Escaping and code page issues are the reasons why I prefer to store things like file names, person names, passwords etc. in INI type files. It is not only the at-sign of the email address that has to be escaped. What if the password contains a string like %{ENV} or a backslash?

      My module of choice here is Config::Tiny. You can specify the encoding for the INI file, so that you can e.g. copy file names with ûμ⌊αυτς directly from the directory listing, etc.

      I'm very grateful for 2 excellent responses. I've coded toward both haukex's and soonix's suggestions, and I haven't completely disentangled ideas that are still crowding out the good ones here. I think I can lay out what I've done, state a better specification, and outline the things that cause trouble.

      I also looked at config1.pm and realized that my claim that I only had sensitive values in one file per workspace was false. My new specification for the copy to the temp directory is that any file that matches "config", then a number, a dot, and a "pm" shall not be copied. Meanwhile, I have created a template file, config3.tmpl, that is of the form it needs to be less the definition of the hash, which is to be templated in. I don't see why Text::Template wouldn't be fine. The longer arc of my question is how both to populate and depopulate such hashes. What makes it very difficult is that we seem to be acting on perl syntax itself.
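
      The exclusion rule could be sketched as a single regex filter. The list of names below stands in for a real directory listing, and anchoring the pattern at both ends is an assumption on my part, made so that a name such as config2.pm.templ would still be copied.

```perl
use strict;
use warnings;

# File names mentioned in this thread, standing in for a directory listing.
my @names = qw(config1.pm config2.pm config3.tmpl utils1.pm utils2.pm 1.redact.pl);

# "config", one or more digits, a dot, then "pm", anchored at both ends
# (the anchoring is an assumption, not part of the stated specification).
my $sensitive = qr/\Aconfig\d+\.pm\z/;

# Keep everything that does not match the sensitive pattern.
my @to_copy = grep { $_ !~ $sensitive } @names;

print "$_\n" for @to_copy;
```

      With Path::Tiny the same test would be applied to each path's basename before copying.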

      If we look at haukex's output, the hash begins

      $CONFIG = { my_github => {

      , yet if we're to imitate the syntax in my source it begins:

      our %config = ( my_sftp => {

      I'm more than a little confused about what seem to be competing definitions of a hash.

      What I tried to do was extend what I have now to create a Config::Tiny ini file. I think the install went well, and judging by how quickly it went, it truly seems tiny:

      All tests successful.
      Files=5, Tests=49, 3 wallclock secs ( 0.06 usr 0.01 sys + 0.45 cusr + 0.10 csys = 0.62 CPU)
      Result: PASS
      RSAVAGE/Config-Tiny-2.23.tgz
      /usr/bin/make test -- OK
      Running make install
      Manifying 1 pod document
      Installing /usr/local/share/perl/5.26.1/Config/Tiny.pm
      Installing /usr/local/man/man3/Config::Tiny.3pm
      Appending installation info to /usr/local/lib/x86_64-linux-gnu/perl/5.26.1/perllocal.pod
      RSAVAGE/Config-Tiny-2.23.tgz
      /usr/bin/make install -- OK
      cpan[2]> q
      Terminal does not support GetHistory.
      Lockfile removed.
      $ man Config::Tiny
      $ ./1.redact.pl
      Variable "$config" is not imported at template_stuff/utils2.pm line 61.
      Global symbol "$config" requires explicit package name (did you forget to declare "my $config"?) at template_stuff/utils2.pm line 61.
      Compilation failed in require at ./1.redact.pl line 6.
      BEGIN failed--compilation aborted at ./1.redact.pl line 6.
      $ cat config2.pm
      package config2;
      use Exporter qw(import);
      our @EXPORT = qw(%config);
      our %config = (
        my_sftp => {
          domain => '',
          username => '',
          password => '',
        },
        my_github => {
          email => '',
          password => '',
        },
      );
      1;
      $

      What I don't seem to be sharing is the "our" on the definition of "our %config". Why is my syntax on the config hash clashing with that provided by man Config::Tiny and haukex?

        if we're to imitate the syntax in my source it begins: our %config = (

        You can get this kind of output from my code by changing Data::Dumper->new([$CONFIG],['$CONFIG']) to Data::Dumper->new([$CONFIG],['*config']) (change the variable names as you like); you'd just have to prepend the our yourself. See also the Data::Dumper docs.
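
        To make the difference concrete, here is a small sketch that dumps the same data under both names. The sample data and the helper dump_as are made up for the illustration; the Data::Dumper options are the ones used in the code above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sample data only; the values are placeholders.
my $data = { my_github => { email => 'redacted', password => 'redacted' } };

# Hypothetical helper: dump $data under the given variable name.
sub dump_as {
    my ($name) = @_;
    return Data::Dumper->new([$data], [$name])
        ->Useqq(1)->Sortkeys(1)->Quotekeys(0)->Dump;
}

my $as_ref  = dump_as('$CONFIG');   # "$CONFIG = { ... };"  a hash reference
my $as_hash = dump_as('*config');   # "%config = ( ... );"  a real hash

# The "our" has to be prepended by hand.
print $as_ref, "\n", 'our ', $as_hash;
```

        Both forms hold the same data: one is a scalar holding a reference to an anonymous hash (curly braces), the other is a hash assigned from a list (parentheses).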

        What makes it very difficult is that we seem to be acting on perl syntax itself.

        Yes, that's a pretty complex topic, so soonix's suggestion to use a different configuration file format is a good way to avoid that issue.

        Variable "$config" is not imported at template_stuff/utils2.pm line 61.
        Global symbol "$config" requires explicit package name (did you forget to declare "my $config"?) at template_stuff/utils2.pm line 61.
        Why is my syntax on the config hash clashing with that provided by man Config::Tiny and haukex?

        You've done use config2;, which should give you a hash %config, which you can confirm if you comment out the line with ->write: do you see the output of print Dumper \%config;? Then, you're doing $config->write, which means "call the method write on the object $config" - not the hash %config! That's why Perl complains about the missing variable $config.
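
        The same compile-time error can be reproduced in a few lines. This is only a sketch: the hash is declared inline instead of being imported from config2.pm, and the string eval exists solely to capture the error message.

```perl
use strict;
use warnings;

# Declared inline here; in the thread this hash is imported from config2.pm.
our %config = ( my_github => { email => '', password => '' } );

# %config (a hash) and $config (a scalar) are unrelated variables.
# Under strict, code that mentions an undeclared $config fails to compile;
# the string eval is used only to capture that compile-time error.
my $compiled = eval q{ $config->write('example.ini'); 1 };
my $error    = $@;

# A reference to the hash is what can be used like a hashref.
my $config_ref = \%config;
my @sections   = sort keys %$config_ref;

print "compile error: $error" unless $compiled;
print "sections: @sections\n";
```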

        It's possible to create a new Config::Tiny object and then copy over the config from config2.pm's %config into that object. However, I am wondering what the goal of your code is here. Are you trying to automate the conversion from config2.pm into an INI format? If the config is long, I can understand that, but if it's as short as you showed, why not do that step once, by hand? You can then use a method similar to what I showed to create a redacted version, or, again, since that's something you'd probably only be doing once (?), you can do that by hand as well.

        Taking a step back: My understanding so far is that you want to provide your code for download for others to use, is that right? And you're figuring out a way to disentangle the stuff you've written for yourself (config files etc.) from the general stuff that you want to distribute to everyone? That's pretty common, and a situation I've been in plenty of times. I've found the best approach is to look at it from a different angle and put yourself in the user's shoes, as if you knew nothing about the internals of the package: what steps would a user who downloads the package have to take to set it up? How would you describe those steps in your README for the user to follow? And then, what do you as the developer have to provide to make that as easy as possible?

        I cobbled together an example for using your %config as a hash as you used it rather than as a hashref.
        #!/usr/bin/env perl
        use 5.011; # implies strict + feature 'say'
        use warnings;
        use Data::Dumper;
        use Config::Tiny;
        use constant {
            INIFILE => 'example.ini',
            ENCODING => 'encoding(Windows-1252)'
        };

        sub create {
            my %config = (
                my_sftp => {
                    domain => 'example.com',
                    username => 'myname',
                    password => 'topsecret',
                },
                my_github => {
                    email => "me\@example.com",
                    password => "confidential",
                },
            );
            say Dumper \%config;
            my $ini = bless \%config, 'Config::Tiny';
            $ini->write(INIFILE, ENCODING);
            say 'created ', INIFILE;
        }

        sub show {
            say 'reading ', INIFILE;
            my $ini = Config::Tiny->read(INIFILE, ENCODING);
            my %config = %$ini;
            say Dumper \%config;
        }

        create() unless -e INIFILE;
        show()

        There are two configuration values which don't belong in the config file :-) which I defined using constant.

        If you don't want to mess with Config::Tiny's internals, you could replace the my $ini = bless \%config, 'Config::Tiny'; line with
        my $ini = Config::Tiny->new;
        $ini->{$_} = $config{$_} for keys %config;
        Edit: Sorry, this was meant as a reply to Re^2: redacting from config hash...

        Update: changed 'example.ini' to INIFILE in line 40 🤦