ovedpo15 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks!

I'm trying to create a perl script that creates a bash script "on the fly" (bear with me!) that will mimic the system in another environment (inside a container).
I have an array of paths. Those paths could be real (hard) paths or links. The link does not have to be the final file/directory in the path; it could be some component along the way.
For example, I might have the path /a/b/c where /a/b is a link: /a/b -> /d/e.
Also, those paths do not have to be realpaths (they could contain .. and . and other combinations).
Based on those paths, I want to mimic the system I have. The input of my perl script is a file of paths (that I read into an array) and the output is a bash script that will mimic the system.
By mimic I mean that it will do the following things:
1. Create the same directory hierarchy.
2. Copy the files.
3. Create the same links.
For that I can do:
1. I can use mkdir -p to create the full hierarchy based on the path.
2. I can use scp/rsync for copying (as it's inside a container).
3. I can use ln -s to create the links.
Note that I want to create directories before copying the files. I don't want to copy the directories because those directories could have files or other directories that are not located inside the array of paths.
But my question is not about this part. My question is what is the best way to parse the array of paths, find out which of them are directories, files and links, and then use that for each of those three stages. I thought of having a hash structure and loading each path into that structure, sort of "mapping" the system paths into a hash. For each path I will have another field which says its type: file, link or directory. For a link I will have to keep the source and destination. Then I could iterate over the hash and create three arrays: one for mkdir, one for ln, and one for copy. After that, I can iterate over the mkdir array and print into the bash script file:
mkdir -p <path-to-dir>
Iterate over the files array and print into the bash script file:
scp user@machine:<path-to-file> <path-to-file>
Iterate over the links array and print into the bash script file:
ln -s <target> <link-name>
Between reading the paths file and creating the bash script itself I have a black box in my mind which I'm trying to implement, and the question is about that black box. If there was a module into which you could just "load" the paths and then query for what you need (like "give me all of the directories") it would be awesome, but I could not find one. What would be the best strategy here?
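A minimal sketch of the three-stage plan described above, assuming the hash has already been filled in. The sample paths, the `user@machine` remote, and the helper names `shell_quote` and `make_script` are hypothetical; the files here are recorded under their real (link-free) paths so that creating the links last is safe.

```perl
use strict;
use warnings;

# Hypothetical input: path => { type => ..., target => ... }.
# In a real run this map would be filled by inspecting the filesystem.
my %map = (
    '/a'     => { type => 'dir' },
    '/a/b'   => { type => 'link', target => '/d/e' },
    '/d/e'   => { type => 'dir' },
    '/d/e/c' => { type => 'file' },
);

# Quote a string so bash treats it as a single literal word.
sub shell_quote {
    my ($s) = @_;
    $s =~ s/'/'\\''/g;
    return "'$s'";
}

# Build the script text: directories first, then files, then links.
sub make_script {
    my ($map, $remote) = @_;
    my (@dirs, @files, @links);
    for my $path (sort keys %$map) {
        my $type = $map->{$path}{type};
        if    ($type eq 'dir')  { push @dirs,  $path }
        elsif ($type eq 'file') { push @files, $path }
        elsif ($type eq 'link') { push @links, $path }
    }
    my @out = ('#!/bin/bash', 'set -e');
    push @out, 'mkdir -p ' . shell_quote($_) for @dirs;
    push @out, 'scp ' . shell_quote("${remote}:$_") . ' ' . shell_quote($_)
        for @files;
    push @out, 'ln -s ' . shell_quote($map->{$_}{target}) . ' ' . shell_quote($_)
        for @links;
    return join("\n", @out) . "\n";
}

my $script = make_script(\%map, 'user@machine');
print $script;
```

Quoting every path keeps names with spaces or shell metacharacters from breaking the generated script.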
Hope my question makes sense. If you want me to explain some part, I will be glad to.

Replies are listed 'Best First'.
Re: Creating a bash script "on the fly"
by haukex (Archbishop) on Mar 26, 2021 at 21:40 UTC

    Since you already mention rsync: it should be able to do everything you mention. See its --archive (implies --recursive --links --perms --times --group --owner --devices --specials), --hard-links, and --sparse options for mirroring a filesystem as closely as possible (perhaps --no-owner --no-group if the UID/GIDs are different and you don't want to keep them), its --files-from option to give it a list of files to sync (this implies --relative and disables --recursive, so I usually add the latter back on), plus its --filter, --exclude-from and --include-from to control which files are included and excluded. If you have files spread out sparsely across the whole filesystem that you want to sync, it's possible, but setting up the right combination of the aforementioned options takes a bit of reading in the manpage to get right. The --dry-run --itemize-changes --verbose options are useful for testing.
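As a rough illustration of how the options named above might be combined, here is a sketch that only assembles and prints the rsync argument list. The function name `rsync_command`, the file name `paths.txt`, and the source and destination are hypothetical placeholders, not a tested recipe.

```perl
use strict;
use warnings;

# Assemble an rsync argument list from the options discussed above.
# All values passed in are placeholders chosen for illustration.
sub rsync_command {
    my (%opt) = @_;
    my @cmd = (
        'rsync',
        '--archive', '--hard-links', '--sparse',
        "--files-from=$opt{list_file}",
        '--recursive',    # --files-from turns recursion off; add it back
    );
    # Drop ownership when UIDs/GIDs differ between the two systems.
    push @cmd, '--no-owner', '--no-group' if $opt{drop_ids};
    # Report what would change without transferring anything.
    push @cmd, '--dry-run', '--itemize-changes', '--verbose' if $opt{test};
    push @cmd, $opt{src}, $opt{dest};
    return @cmd;
}

my @cmd = rsync_command(
    list_file => 'paths.txt',
    src       => 'user@machine:/',
    dest      => '/',
    test      => 1,
);
print join(' ', @cmd), "\n";

# To actually run it (list form avoids shell quoting problems):
# system(@cmd) == 0 or die "rsync failed: $?";
```

Starting with `test => 1` keeps the first run a dry run, so the itemized output can be checked before anything is written.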

    Several years ago I did a bunch of research on rsync to arrive at the above combination of options and wrote a Perl frontend for it. Unfortunately it's not really something to publish, but maybe I'll get around to cleaning it up for publishing someday.

    Of course I don't want to discourage writing a Perl solution though :-) Note you don't need to create a bash script: The equivalent to mkdir -p is File::Path's make_path, scp can be done by Net::OpenSSH, and links via symlink.
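A small sketch of the pure-Perl route, using only core modules and a throwaway temp directory. Since Net::OpenSSH is not a core module, a local File::Copy `copy` stands in for the scp step here; that substitution and the file names are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Copy qw(copy);
use File::Temp qw(tempdir);
use File::Spec;

# Work inside a throwaway directory so the sketch touches nothing real.
my $root = tempdir(CLEANUP => 1);

# 1. Directories: make_path creates missing parents, like mkdir -p.
my $dir = File::Spec->catdir($root, 'd', 'e');
make_path($dir);

# 2. Files: a local copy stands in for the scp step.
my $src = File::Spec->catfile($root, 'source.txt');
open my $fh, '>', $src or die "open $src: $!";
print {$fh} "hello\n";
close $fh or die "close $src: $!";
my $file = File::Spec->catfile($dir, 'c');
copy($src, $file) or die "copy failed: $!";

# 3. Links: symlink(TARGET, LINKNAME), same argument order as ln -s.
my $link = File::Spec->catdir($root, 'b');
symlink($dir, $link) or die "symlink failed: $!";

my $via_link = File::Spec->catfile($link, 'c');
print "link target: ", readlink($link), "\n";
print "file reachable through link\n" if -f $via_link;
```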

Re: Creating a bash script "on the fly"
by 1nickt (Canon) on Mar 26, 2021 at 20:22 UTC

    This kind of black box module?

    use v5.12;
    use strict;
    use warnings;

    package MyClass {
        use Moo;

        has _map => (
            is      => 'lazy',
            builder => sub {
                # however you get your list put that here
                return {
                    'file1'         => { type => 'file' },
                    'somedir/file2' => { type => 'file' },
                    'dir1'          => { type => 'dir' },
                    'somedir/dir2'  => { type => 'dir' },
                    'link1'         => { type => 'link', target => 'somedir/file1' },
                    'link2'         => { type => 'link', target => 'otherdir/file3' },
                };
            },
        );

        sub files {
            my $self = shift;
            return [ grep { $self->_map->{$_}{type} eq 'file' } keys %{ $self->_map } ];
        }

        sub directories {
            my $self = shift;
            return [ grep { $self->_map->{$_}{type} eq 'dir' } keys %{ $self->_map } ];
        }

        sub links {
            my $self = shift;
            my @links = grep { $self->_map->{$_}{type} eq 'link' } keys %{ $self->_map };
            return [ map { { $_ => $self->_map->{$_}{target} } } @links ];
        }
    };

    my $obj = MyClass->new;

    say for @{ $obj->directories };

    Output:

    dir1
    somedir/dir2

    Hope this helps!


    The way forward always starts with a minimal test.
      Glad to see someone write a bunch of the code that needed writing, but that I didn't have time to write. As a piece of feedback, you should probably use File::Spec to do some path manipulation, particularly since you need to create parent directories before their children.

      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

        Or Path::Tiny, which will let you create the directories (otherwise you'll probably want File::Path for make_path, which works akin to mkdir -p, and call it yourself).

        The cake is a lie.

      Hi!
      Thank you for the suggestion. My question focuses on how to build the builder part. I have a bunch of paths. How do I know, for example, if there is a link in the path? Even if I iterate over each path and check it, how do I find the link? realpath won't work for links such as: a/b -> c/d -> e/f
        I have a bunch of paths. How do I know, for example, if there is a link in the path?

        You won't know from just looking at the string. You'll have to check the filesystem, and for each pathname break it down into its components, using e.g. -l on each one. You can break down a filename using e.g. splitdir from File::Spec, or IMO a little easier, Path::Class. Maybe something like:

        use warnings;
        use strict;
        use Path::Class qw/file dir/;

        my $file = file('/tmp/foolink/bar/quz');
        my $prev;
        while (1) {
            die "doesn't exist: $file" unless -e $file;
            print $file, " is a ",
                -l $file ? 'link to '.readlink($file)
                : -f $file ? 'file'
                : -d _ ? 'dir'
                : 'special', "\n";
            $prev = $file;
            $file = $file->parent;
            last if $prev eq $file;
        }
        __END__
        /tmp/foolink/bar/quz is a file
        /tmp/foolink/bar is a dir
        /tmp/foolink is a link to foo
        /tmp is a dir
        / is a dir

        Update: Added a check to the above code to make sure the file exists in the first place.
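One possible way to turn the per-component check into the "builder" asked about, sketched with core modules only (File::Spec instead of Path::Class). It walks each path from the root, substitutes a link's target as soon as a link is met, and therefore also follows chains such as a/b -> c/d -> e/f. The function name `classify_path`, the limit of 40 hops, and the Unix-only '/' handling are assumptions of this sketch.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use Cwd qw(realpath);

# Walk one absolute path component by component and record what each
# prefix is. Fills %$map with path => { type => ..., target => ... }.
sub classify_path {
    my ($path, $map) = @_;
    my @todo = grep { length } File::Spec->splitdir($path);
    my @done;        # link-free components resolved so far
    my $hops = 0;    # guard against symlink loops
    while (@todo) {
        my $part = shift @todo;
        next if $part eq '.';
        if ($part eq '..') { pop @done; next }
        my $cur = '/' . join('/', @done, $part);
        if (-l $cur) {
            die "too many links at $cur" if ++$hops > 40;
            my $target = readlink $cur;
            $map->{$cur} = { type => 'link', target => $target };
            # An absolute target restarts from the root; a relative
            # one continues from the directory that holds the link.
            @done = () if File::Spec->file_name_is_absolute($target);
            unshift @todo, grep { length } File::Spec->splitdir($target);
            next;
        }
        die "doesn't exist: $cur" unless -e $cur;
        my $type = (-d _) ? 'dir' : (-f _) ? 'file' : 'special';
        $map->{$cur} = { type => $type };
        push @done, $part;
    }
    return;
}

# Demo in a temp directory: a/b is a relative link to ../d/e.
my $root = realpath(tempdir(CLEANUP => 1));
make_path("$root/d/e", "$root/a");
open my $fh, '>', "$root/d/e/c" or die "open: $!";
close $fh or die "close: $!";
symlink('../d/e', "$root/a/b") or die "symlink: $!";

my %map;
classify_path("$root/a/./b/c", \%map);
for my $p (sort keys %map) {
    my $e = $map{$p};
    print "$p is a $e->{type}",
        ($e->{type} eq 'link' ? " to $e->{target}" : ''), "\n";
}
```

Because every entry after a link is recorded under its link-free location, the resulting hash lists the link itself, the directories it points into, and the file's real path as separate entries.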

Re: Creating a bash script "on the fly"
by perlfan (Parson) on Mar 28, 2021 at 05:03 UTC
    Does shar (*nix shell archive utility) provide what you need in any reasonable way?