Re: Getting all subpaths from a path

Re^2: Getting all subpaths from a path by ovedpo15 (Pilgrim) on Mar 29, 2021 at 14:08 UTC
Hi! You are probably right, it has become an X-Y problem. I will do my best to explain what I'm trying to do and what led me to open the current question.

As I explained in the previous topic (https://www.perlmonks.org/?node_id=11130389), I'm trying to create a bash script "on the fly" from an array of paths. The script goes through the following three stages:

1. Create the same directory hierarchy.
2. Copy the files.
3. Create the same links.

For that:

1. I can use mkdir -p to create the full hierarchy based on the path.
2. I can use scp/rsync for copying (as it's inside a container).
3. I can use ln -s to create the links.

So I wanted to build a structure that will contain all the information (links, directories, files). I came up with the following structure:

```
{
  "/": {
    "type": "dir",
    "files": [
      {
        "usr": {
          "type": "dir",
          "files": [
            {
              "vsa": {
                "type": "link-dir",
                "source": "/root/site/tools/gauv"
              }
            }
          ]
        }
      },
      {
        "root": {
          "type": "dir",
          "files": [
            {
              "site": {
                "type": "dir",
                "files": [
                  {
                    "tools": {
                      "type": "dir",
                      "files": [
                        {
                          "gauv": {
                            "type": "dir",
                            "files": [
                              {
                                "pkgs": {
                                  "type": "dir",
                                  "files": [
                                    {
                                      "python3": {
                                        "type": "dir",
                                        "files": [
                                          {
                                            "3.6.3a": {
                                              "type": "dir-link",
                                              "source": "/usr/vsa/pkgs/python3/3.6.3"
                                            }
                                          },
                                          {
                                            "3.6.3": {
                                              "type": "dir",
                                              "links": [
                                                {
                                                  "lib": {
                                                    "type": "dir",
                                                    "links": []
                                                  }
                                                },
                                                {
                                                  "bin": {
                                                    "type": "dir",
                                                    "links": []
                                                  }
                                                }
                                              ]
                                            }
                                          }
                                        ]
                                      }
                                    }
                                  ]
                                }
                              }
                            ]
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    ]
  }
}
```

It contains only the path `/usr/vsa/pkgs/python3/3.6.3a/bin/python3.6` with its links (as I described in the question). So I will parse each path and create this structure. Once I have this structure, I can extract all of the directories, files, and links (dir links and file links) into arrays and use them to build the bash script (write bash commands based on those paths into a file). That's the purpose of the whole idea.

So my strategy was:

1. Parse each path (by getting all subpaths and links) and insert them into an array of all paths.
2. Check the type of each path in the array (link, file, directory) and insert it into the structure.
3. Extract an array of dir paths, an array of dir links, an array of files, and an array of file links.
4. Iterate over each array and create the bash script.

I'm having trouble with steps 1-2. In the current node I ask about step 1. I wanted to parse each path and split it into subpaths, then check whether each subpath is a link. If it is, I insert the target of the link into the array and change all of the other subpaths (for example, if I have (/a,/a/b/,/a/b/d) and /a/b->/e/f then it should be (/a,/a/b/,/e/f,/e/f/d)). I should also handle two special cases here:

1. Relative links - I'm not sure how to handle them currently. I have tried to handle only local relative links like /a/b/c -> /a/b/d, but it's getting complicated.
2. Recursive links - for example /a/b -> /c/d -> e/f ... That's why I used `while(1)`.

To sum up, these are the big questions:

1. What would be the best design strategy to implement here? Was my idea good?
2. If so, is my suggested structure good enough? How would you change it?

Now, for what you suggested:

1. The idea is to create a bash script that copies the environment into a container. rsync can help me here, but only in step 4 (while copying files, instead of scp). I can't use rsync on the whole directory because it would then copy files that are not in the array of paths. Assume the input array of paths is (/a/b/1.file, /a/b/2.file) and there is also a 3.file inside /a/b. I don't want to copy it, only 1.file and 2.file, so rsync on the whole directory won't work here. It can be used to copy individual files (which is the same as scp).
2. I have now tried splitdir and you are right, it's better than splitting by "/". Is there a subroutine that can give me all the subpaths of a path?
3. Yes, it's a good subroutine, but I can't use it yet because I need to parse each path and find out if it's a link. abs_path will give me the final path, but I also want the intermediate links (as I mentioned before, with /a/b -> /c/d -> e/f, abs_path will just return /e/f and ignore /c/d).

I hope this post clarifies some open questions. If not, I will be more than glad to answer more. I'm sorry if I didn't explain the question well enough before. Thanks for the help so far!

Also, here is some more code that I wrote while trying to make it work (just for reference; really sorry that it's messy and has bad variable names):

```perl
foreach my $f (@arr) {
    if (-l $f) {
        print($f, " is a link to ",readlink($f) , "\n");
        my @a = split("/",$f);
        my $result;
        my $counter = 0;
        my $last_files_block = $st{"/"}{"files"};
        while (1) {
            unless ($counter < scalar(@a)) {
                last;
            }
            my $x = $a[$counter];
            if ($x eq '') {
                $counter += 1;
                next;
            }
            if ($counter + 1 == scalar(@a)) {
                if (-f $f) {
                    my $found = 0;
                    foreach my $v (@{$last_files_block}) {
                        if (defined($v->{$x})) {
                            $found = 1;
                            last;
                        }
                    }
                    if ($found == 0) {
                        my %vsaaa = ("type" => "link-file", "source" => readlink($f));
                        my %st1 = ($x => \%vsaaa );
                        push(@{$last_files_block}, \%st1);
                    }
                    my $last = $f;
                    while (1) {
                        my $c = readlink($last);
                        if (-l $c) {
                            $last = $c;
                            if (index($c,"/") != -1) {
                                push(@arr,$c);
                            } else {
                                my $found1 = 0;
                                foreach my $v (@{$last_files_block}) {
                                    if (defined($v->{$x})) {
                                        $found = 1;
                                        last;
                                    }
                                }
                                if ($found1 == 0) {
                                    my %vsaaa = ("type" => "file");
                                    my %st1 = ($x => \%vsaaa );
                                    push(@{$last_files_block}, \%st1);
                                }
                            }
                        } else {
                            if (index($c,"/") != -1) {
                                push(@arr,$c);
                            } else {
                                my $found1 = 0;
                                foreach my $v (@{$last_files_block}) {
                                    if (defined($v->{$x})) {
                                        $found = 1;
                                        last;
                                    }
                                }
                                if ($found1 == 0) {
                                    my %vsaaa = ("type" => "file");
                                    my %st1 = ($x => \%vsaaa );
                                    push(@{$last_files_block}, \%st1);
                                }
                            }
                            last;
                        }
                    }
                }
                if (-d $f) {
                    my $found = 0;
                    foreach my $v (@{$last_files_block}) {
                        if (defined($v->{$x})) {
                            $found = 1;
                            last;
                        }
                    }
                    if ($found == 0) {
                        my $n = readlink($f);
                        if (index($n,"/") == -1) {
                            my $dirname = dirname($f);
                            $n = "$dirname/$n"; #TODO: what if relative?
                        }
                        my %vsaaa = ("type" => "dir-link", "source" => $n);
                        my %st1 = ($x => \%vsaaa );
                        push(@{$last_files_block}, \%st1);
                    }
                }
                last;
            }
            my $found = 0;
            foreach my $v (@{$last_files_block}) {
                if (defined($v->{$x})) {
                    $last_files_block = $v->{$x}{"files"};
                    $counter += 1;
                    $found = 1;
                    last;
                }
            }
            if ($found == 0) {
                my %vsaaa = ("type" => "dir", "files" => [] );
                my %st1 = ($x => \%vsaaa );
                push(@{$last_files_block}, \%st1);
                $last_files_block = $vsaaa{"files"};
                $counter += 1;
            }
        }
    } elsif (-f $f) {
        print($f, " is a file\n");
        my @a = split("/",$f);
        my $result;
        my $counter = 0;
        my $last_files_block = $st{"/"}{"files"};
        while (1) {
            unless ($counter < scalar(@a)) {
                last;
            }
            my $x = $a[$counter];
            if ($x eq '') {
                $counter += 1;
                next;
            }
            if ($counter + 1 == scalar(@a)) {
                my $found = 0;
                foreach my $v (@{$last_files_block}) {
                    if (defined($v->{$x})) {
                        $found = 1;
                        last;
                    }
                }
                if ($found == 0) {
                    my %vsaaa = ("type" => "file");
                    my %st1 = ($x => \%vsaaa );
                    push(@{$last_files_block}, \%st1);
                }
                last;
            }
            my $found = 0;
            foreach my $v (@{$last_files_block}) {
                if (defined($v->{$x})) {
                    $last_files_block = $v->{$x}{"files"};
                    $counter += 1;
                    $found = 1;
                    last;
                }
            }
            if ($found == 0) {
                my %vsaaa = ("type" => "dir", "files" => [] );
                my %st1 = ($x => \%vsaaa );
                push(@{$last_files_block}, \%st1);
                $last_files_block = $vsaaa{"files"};
                $counter += 1;
            }
        }
    } elsif (-d $f) {
        print($f, " is a dir\n");
        my @a = split("/",$f);
        my $result;
        my $counter = 0;
        my $last_files_block = $st{"/"}{"files"};
        while (1) {
            unless ($counter < scalar(@a)) {
                last;
            }
            my $x = $a[$counter];
            if ($x eq '') {
                $counter += 1;
                next;
            }
            my $found = 0;
            my $found_link = 0;
            foreach my $v (@{$last_files_block}) {
                if (defined($v->{$x})) {
                    if ($v->{$x}{"type"} eq "dir-link" || $v->{$x}{"type"} eq "link-file") {
                        $found_link = 1;
                        last;
                    }
                    $last_files_block = $v->{$x}{"files"};
                    $counter += 1;
                    $found = 1;
                    last;
                }
            }
            if ($found_link == 1) {
                last;
            }
            if ($found == 0) {
                my %vsaaa = ("type" => "dir", "files" => [] );
                my %st1 = ($x => \%vsaaa );
                push(@{$last_files_block}, \%st1);
                $last_files_block = $vsaaa{"files"};
                $counter += 1;
            }
        }
    } else {
        #TODO: When can it happen other than path does not exist or permission denied?
        print($f, " is a special\n");
    }
}
```
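On the two concrete questions in this post (all subpaths of a path, and the intermediate targets that abs_path skips), here is a minimal sketch using only core modules. The names `subpaths`, `link_chain`, and the `$max_hops` guard are hypothetical, not something from the thread. The sketch assumes Unix-style absolute paths and does not collapse `..` segments in relative link targets.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Basename qw(dirname);

# Return every leading subpath of an absolute path, shortest first.
# Purely textual: nothing on disk is consulted.
sub subpaths {
    my ($path) = @_;
    my @parts = grep { length } File::Spec->splitdir($path);
    my @out;
    for my $i (0 .. $#parts) {
        push @out, File::Spec->catdir(File::Spec->rootdir, @parts[0 .. $i]);
    }
    return @out;
}

# Follow a symlink one hop at a time and return each target along the way.
# A relative target is resolved against the directory that holds the link.
# $max_hops is a hypothetical guard against link cycles.
sub link_chain {
    my ($path, $max_hops) = @_;
    $max_hops //= 40;
    my @chain;
    while (-l $path && $max_hops-- > 0) {
        my $target = readlink $path;
        last unless defined $target;
        $target = File::Spec->rel2abs($target, dirname($path))
            unless File::Spec->file_name_is_absolute($target);
        push @chain, $target;
        $path = $target;
    }
    return @chain;
}

print "$_\n" for subpaths('/a/b/d');
```

Calling `link_chain` on each entry returned by `subpaths` would give the extra paths to add for every link met on the way down, which is roughly what the `while(1)` loop above is trying to do.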
Re^3: Getting all subpaths from a path by haukex (Archbishop) on Apr 02, 2021 at 08:39 UTC
I think the significant bit of information that was missing previously (the "X" in the XY Problem) is what you mentioned here: "I'm trying to create a Singularity recipes builder." By this I'm guessing you mean Singularity, and their "Recipes" to build containers, more specifically, something you can execute in their `Singularity` file `%post` section (which gets executed with `/bin/sh`) to build the container?

By "recipes builder", do you mean you want to write a Perl script that will generate commands that can be executed by `/bin/sh` to reproduce a certain environment (directory structure, links, etc.)? In other words, you want to write a Perl script that will generate a sequence of `mkdir -p` commands, followed by `cp` commands, followed by `ln -s` commands, such that when Singularity builds the container and executes the script containing these commands, those dirs/links/files will be present in the generated `squashfs` image?

(By the way, why not use the built-in `%files` section?)

Note that the fact that I had to deduce all this means you need to describe your task better `:-)` Remember to explain the "X" you're trying to accomplish, plus sample input and the expected output for that input - something like a high-level SSCCE.

You haven't shown your input, which I am guessing is the filesystem that you want to mirror into the container? One way you could provide an SSCCE for us is to give us a list of commands to recreate the directory structure.

You also haven't shown your expected output, i.e. the `/bin/sh` script you want to produce.

Interesting: Note that both input and output are basically the same thing!

So if I'm correct with all my guesses so far, the problem can be more or less reduced to: a Perl script that will basically round-trip a `/bin/sh` script containing `mkdir`, `cp`, and `ln` commands.

However, since that's a lot of guessing, I'm going to stop here for now - please let us know if the above is correct or not, and if not, what it is you're actually trying to do. (Also, looking over choroba's sample code, it looks like a good starting point.)
Re^4: Getting all subpaths from a path by ovedpo15 (Pilgrim) on Apr 02, 2021 at 15:21 UTC
Yes, I'm trying to create those recipes on the fly. The user gives me all the paths that he thinks are needed to run the tool inside the container (he gives a file that contains those paths and I read them into an array). With those paths I can build the recipe. In the %setup section I will create the directories, in the %files section I will copy the files, and in the %post section I will create the links. So I don't really want to create a shell script; I want to build the recipe with Perl.

I didn't want to talk about Singularity because I guessed most people here are not familiar with it, so I tried to simplify it to creating a shell script (aka the recipe) that creates those directories, copies files and creates links. Now that we are talking about recipes: the purpose of the Perl script is to build the recipe, based on all the paths that the user thinks are needed for running his tool in the container. So the input is really the paths, as I explained, and the output is the recipe (aka the shell script).

So my question still remains. Given the paths, I want to build some structure from which I could easily extract all the files/links/directories and use them for creating the recipe file. If you think there is a better way of creating it, I'm all ears. choroba's answer is a good start, but I had some questions that I commented under it.
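One possible shape for that structure is three flat lists rather than a nested tree, since each recipe section only needs one kind of entry. The following is a rough sketch under several assumptions: `classify_paths` and `recipe_sections` are hypothetical names, `${SINGULARITY_ROOTFS}` in %setup is assumed to point at the container root, %files lines are assumed to take the form "source destination", and no shell quoting is applied, so paths with spaces or quotes would need extra care.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(dirname);

# Sort paths into flat lists of directories, files, and links.
# -l is tested first so that a symlink is recorded as a link and not
# as whatever it points to. Parent directories of files and links are
# added to the directory list so that they get created too.
sub classify_paths {
    my (@paths) = @_;
    my (%dirs, @files, @links);
    for my $p (@paths) {
        if (-l $p) {
            push @links, [ $p, readlink($p) ];
            $dirs{ dirname($p) } = 1;
        }
        elsif (-d $p) {
            $dirs{$p} = 1;
        }
        elsif (-f $p) {
            push @files, $p;
            $dirs{ dirname($p) } = 1;
        }
        else {
            warn "skipping $p: missing, unreadable, or a special file\n";
        }
    }
    return {
        dirs  => [ sort keys %dirs ],
        files => [ sort @files ],
        links => [ sort { $a->[0] cmp $b->[0] } @links ],
    };
}

# Render the three lists as the body of the %setup, %files, and %post
# sections. The recipe header is left out on purpose.
sub recipe_sections {
    my ($bucket) = @_;
    my @out = ('%setup');
    push @out, qq(    mkdir -p "\${SINGULARITY_ROOTFS}$_") for @{ $bucket->{dirs} };
    push @out, '', '%files';
    push @out, "    $_ $_" for @{ $bucket->{files} };
    push @out, '', '%post';
    push @out, qq(    ln -s "$_->[1]" "$_->[0]") for @{ $bucket->{links} };
    return join "\n", @out, '';
}

print recipe_sections(classify_paths(@ARGV));
```

The link target is written exactly as readlink reports it, so a relative link stays relative inside the container.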
Re^5: Getting all subpaths from a path by haukex (Archbishop) on Apr 02, 2021 at 18:33 UTC
Re^6: Getting all subpaths from a path by ovedpo15 (Pilgrim) on Apr 02, 2021 at 20:27 UTC
Some notes below your chosen depth have not been shown here
Re^3: Getting all subpaths from a path by ovedpo15 (Pilgrim) on Mar 31, 2021 at 13:00 UTC
Can someone suggest a strategy for solving this? I tried some other similar things, but it got too complicated and failed.
Re^4: Getting all subpaths from a path by choroba (Cardinal) on Mar 31, 2021 at 16:13 UTC
You cannot create the links before both the source and target directories exist. So, postpone their creation to the end. The directories are processed from the shortest to the longest, so we always know the parent path exists. You should also add a check that a link doesn't point outside of the given directory tree.

I created a Makefile to experiment with your data (749 bytes, not shown here). And this was the script 1.pl:

```perl
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use Path::Tiny qw{ path };

my $path = path(shift);
my $paths = $path->visit(
    sub {
        # Test -l first: -f and -d follow symlinks, so a link would
        # otherwise be classified by what it points to.
        $_[1]->{$_} = (-l) ? [readlink]
                    : (-f) ? 'file'
                    : (-d) ? 'dir'
                    :        'unknown'
    },
    { recurse => 1 }
);

my @links;
for my $found (sort { length $a <=> length $b } keys %$paths) {
    if ('file' eq $paths->{$found}) {
        say qq(cp '$found' "\$target/$found");
    } elsif ('dir' eq $paths->{$found}) {
        say qq(mkdir "\$target/$found");
    } elsif (ref [] eq ref $paths->{$found}) {
        my $to = path($paths->{$found}[0]);
        $to = $to->relative(path($found)->absolute->parent)
            if $to->is_absolute;
        my $source = path('$target', $found);
        # Links are collected and printed last, after all dirs and files.
        push @links, qq(ln -s "$to" "$source");
    }
}
say for @links;
```

It's just a toy. Use some of the ShellQuote modules to fix the filenames; but for the example given, it works.

`map{substr$_->[0],$_->[1]\|\|0,1}[\\|\|{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^ARGV,3]`
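The two follow-ups mentioned in this reply (quoting filenames for the shell, and checking that a link stays inside the tree) could look roughly like this. It is a hand-rolled sketch, not the API of any ShellQuote module: `sh_quote` and `inside_tree` are hypothetical helpers, and the containment check is purely textual, so it can be fooled when an intermediate path component is itself a symlink.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Wrap a string in single quotes for /bin/sh.
# An embedded single quote is written as '\'' (close, escaped quote, reopen).
sub sh_quote {
    my ($s) = @_;
    $s =~ s/'/'\\''/g;
    return "'$s'";
}

# True if $target stays under $root once it is resolved against
# $link_dir (the directory holding the link) when it is relative.
# "." and ".." segments are collapsed textually, without touching disk.
sub inside_tree {
    my ($root, $link_dir, $target) = @_;
    my $abs = File::Spec->file_name_is_absolute($target)
        ? $target
        : "$link_dir/$target";
    my @stack;
    for my $part (split m{/+}, $abs) {
        next if $part eq '' || $part eq '.';
        if ($part eq '..') { pop @stack; next }
        push @stack, $part;
    }
    my $clean = '/' . join('/', @stack);
    (my $r = $root) =~ s{/+\z}{};
    return $clean eq $r || index($clean, "$r/") == 0;
}

# Hypothetical example values, only to show the two helpers together.
my ($to, $link) = ('../lib', "/srv/tree/it's here/lib");
print 'ln -s ', sh_quote($to), ' ', sh_quote($link), "\n"
    if inside_tree('/srv/tree', "/srv/tree/it's here", $to);
```

A link that fails the check could be reported to the user instead of being written into the generated script.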
Re^5: Getting all subpaths from a path by ovedpo15 (Pilgrim) on Mar 31, 2021 at 18:48 UTC
Re^4: Getting all subpaths from a path by tybalt89 (Monsignor) on Mar 31, 2021 at 18:41 UTC
It seems like you are trying to re-invent 'tar'.
Re^5: Getting all subpaths from a path by ovedpo15 (Pilgrim) on Mar 31, 2021 at 18:58 UTC
Re^4: Getting all subpaths from a path by haukex (Archbishop) on Mar 31, 2021 at 15:18 UTC
Sorry, I've been quite busy, but I hope to get a chance to read your post tomorrow.
Re^5: Getting all subpaths from a path by ovedpo15 (Pilgrim) on Mar 31, 2021 at 15:24 UTC
