I'm confused about the path you're taking here (pun intended). Are you still working on Creating a bash script "on the fly" from a few days ago? If so, then I have to say I think you're creating yourself an XY Problem. I've already pointed you to:
- rsync as a potential solution,
- File::Spec's splitdir, instead you're doing split("/", $current_path) and other manual filename operations, plus it contains other helpful functions, and it's a core module,
- Cwd's abs_path, also a core module,
- Path::Class as very nice wrapper around File::Spec and other utility functions, and
- my script relink, which implements an algorithm to walk a chain of symlinks in its resolvesymlink function.
Also, I'm not sure what you're testing with (index($link,"/") == -1)? Note links to directories don't need to end oncontain a slash; you won't know what the link is pointing to until you check the filesystem. (Update: Looking at your code again, I see a couple more ways in which certain symlinks would break it.)
But what if the link is relative (for example ../../abc)?
You're discovering one of the many intricacies of resolving symlinks. As explained in the documentation of File::Spec's canonpath: "If /foo on your system is a symlink to /bar/baz, then /foo/../quux is actually /bar/quux, not /quux as a naive ../-removal would give you."
Sorry, but based on my interpretation of everything above, I would strongly suggest that you first learn more about symlinks and handling filenames before trying to implement this. Even though I did a lot of research and testing when I wrote the relink script, and I trust my own code, I'm still not sure I've fully understood every little detail and edge case that could happen. It's not an easy topic.
But anyway, I suggest that even before that, maybe you could explain why the above solutions are not acceptable to you, and why you need to reinvent this particular wheel? I'm sure we could come up with a better solution for your actual problem.
If you're still working on Creating a bash script "on the fly", i.e. you're just trying to replicate a directory structure, then could you explain why a solution such as rsync, or simply copying over the links, is not sufficient? (You do know you can get most *NIX tools to not follow symlinks, right?)
Minor edits.
| [reply] [d/l] [select] |
Hi!
You are probably right, it becomes an X-Y problem. I will try to do my best to explain the idea of what I'm trying to do and what lead me to opening the current question. As I explained in previous topic (https://www.perlmonks.org/?node_id=11130389), I'm trying to create a bash script "on the fly" output an array of paths that does the following three stages:
1. Create the same directory hierarchy.
2. Copy the files.
3. Create the same links.
For that I can do:
1. I can use mkdir -p to create the full hierarchy based on the path.
2. I can use scp/rsync for copying (as it's inside container).
3. I can use ln -s to create the links.
So I wanted to build a structure that will contain all the information (links, directories, files). I came up with the following structure:
{
"/": {
"type": "dir",
"files": [
{
"usr": {
"type": "dir",
"files": [
{
"vsa": {
"type": "link-dir",
"source": "/root/site/tools/gauv"
}
}
]
}
},
{
"root": {
"type": "dir",
"files": [
{
"site": {
"type": "dir",
"files": [
{
"tools": {
"type": "dir",
"files": [
{
"gauv": {
"type": "dir",
"files": [
{
"pkgs": {
"type": "dir",
"files": [
{
"python3": {
"type": "dir",
"files": [
{
"3.6.3a": {
"type": "dir-link",
"source": "/usr/vsa/pkgs
+/python3/3.6.3"
}
},
{
"3.6.3": {
"type": "dir",
"links": [
{
"lib": {
"type": "dir",
"links": []
}
},
{
"bin": {
"type": "dir",
"links": []
}
}
]
}
}
]
}
}
]
}
}
]
}
}
]
}
}
]
}
}
]
}
}
]
}
}
Which contains only the path /usr/vsa/pkgs/python3/3.6.3a/bin/python3.6 with it's links (as I described in the question). So I will parse each path and create this structure. Once I have this structure, I can extract all of the directories, files, and links (dir links and file links) into arrays and use them to build the bash script (write bash commands based on those paths into a file). That's the purpose of that whole idea.
So my strategy was:
1. Parse each path (by getting all subpaths and links) and insert into an array on all path.
2. Check the type of each path in the array (link, file, directory) and insert into the structure.
3. Extract arrays of dir paths, array of dir links, array of files, array of file links.
4. Iterate over each array and create the bashe script.
I'm having trouble with steps 1-2. In the current node I ask about step 1. I wanted to parse each path and split it into sub paths. Then check each subpath if it's a link and if so, I will insert the target of the link into the array, change all of the other subpaths (for example if I have (/a,/a/b/,/a/b/d) and /a/b->/e/f then it should be (/a,/a/b/,/e/f,/e/f/d)). I should also handle two special cases here:
1. Relative links - not sure how to handle with it currently. I have tried to handle with only local relative links like /a/b/c -> /a/b/d but it's getting complicated.
2. Recursive links - for example /a/b -> /c/d -> e/f ...). That's why I used while(1).
To sum up, those are the big questions:
1. What would be the best design strategy to implement here? Was my idea good?
2. If so, Is my suggested structure good enough? How would you change it?
Now, for what you suggested:
1. The idea is to create a bash script that copies the environment into a container. rsync can help me here but I it can come in handy in step 4 (while copying files, instead of scp). I can't use rsync on the whole directory because it will then copy files that are not in the array of paths. Assume you have in the input array of path (/a/b/1.file, /a/b/2.file) and you also have 3.file inside /a/b. I don't want to copy it, only 1.file and 2.file so rsync on the whole directory won't work here. It can be used to copy files (which is the same as scp).
2. I have tried now splitdir and you are right, it's better than splitting by "/". Is there a subroutine that can give me all the subpaths of a path?
3. Yes it's a good subroutine but I can't use it yet because I need to parse each path - find out if it's a link. abs_path will give me the final path but I also want to have the recursive links (like I mentioned before /a/b -> /c/d -> e/f, in that case abs_path will just get /e/f and ignore /c/d).
I hope this post will clarify some opened question. If not, I will be more than glad to answer more. I'm sorry if I didn't explain the question good enough before. Thanks for the help until now!
Also, some more code that I wrote, while trying to make it work (just for reference. also really sorry it's messy and with bad variable names):
foreach my $f (@arr) {
if (-l $f) {
print($f, " is a link to ",readlink($f) , "\n");
my @a = split("/",$f);
my $result;
my $counter = 0;
my $last_files_block = $st{"/"}{"files"};
while (1) {
unless ($counter < scalar(@a)) {
last;
}
my $x = $a[$counter];
if ($x eq '') {
$counter += 1;
next;
}
if ($counter + 1 == scalar(@a)) {
if (-f $f) {
my $found = 0;
foreach my $v (@{$last_files_block}) {
if (defined($v->{$x})) {
$found = 1;
last;
}
}
if ($found == 0) {
my %vsaaa = ("type" => "link-file", "source" => re
+adlink($f));
my %st1 = ($x => \%vsaaa );
push(@{$last_files_block}, \%st1);
}
my $last = $f;
while (1) {
my $c = readlink($last);
if (-l $c) {
$last = $c;
if (index($c,"/") != -1) {
push(@arr,$c);
} else {
my $found1 = 0;
foreach my $v (@{$last_files_block}) {
if (defined($v->{$x})) {
$found = 1;
last;
}
}
if ($found1 == 0) {
my %vsaaa = ("type" => "file");
my %st1 = ($x => \%vsaaa );
push(@{$last_files_block}, \%st1);
}
}
} else {
if (index($c,"/") != -1) {
push(@arr,$c);
} else {
my $found1 = 0;
foreach my $v (@{$last_files_block}) {
if (defined($v->{$x})) {
$found = 1;
last;
}
}
if ($found1 == 0) {
my %vsaaa = ("type" => "file");
my %st1 = ($x => \%vsaaa );
push(@{$last_files_block}, \%st1);
}
}
last;
}
}
}
if (-d $f) {
my $found = 0;
foreach my $v (@{$last_files_block}) {
if (defined($v->{$x})) {
$found = 1;
last;
}
}
if ($found == 0) {
my $n = readlink($f);
if (index($n,"/") == -1) {
my $dirname = dirname($f);
$n = "$dirname/$n"; #TODO: what if relativ
+e?
}
my %vsaaa = ("type" => "dir-link", "source" =>
+ $n);
my %st1 = ($x => \%vsaaa );
push(@{$last_files_block}, \%st1);
}
}
last;
}
my $found = 0;
foreach my $v (@{$last_files_block}) {
if (defined($v->{$x})) {
$last_files_block = $v->{$x}{"files"};
$counter += 1;
$found = 1;
last;
}
}
if ($found == 0) {
my %vsaaa = ("type" => "dir", "files" => [] );
my %st1 = ($x => \%vsaaa );
push(@{$last_files_block}, \%st1);
$last_files_block = $vsaaa{"files"};
$counter += 1;
}
}
} elsif (-f $f) {
print($f, " is a file\n");
my @a = split("/",$f);
my $result;
my $counter = 0;
my $last_files_block = $st{"/"}{"files"};
while (1) {
unless ($counter < scalar(@a)) {
last;
}
my $x = $a[$counter];
if ($x eq '') {
$counter += 1;
next;
}
if ($counter + 1 == scalar(@a)) {
my $found = 0;
foreach my $v (@{$last_files_block}) {
if (defined($v->{$x})) {
$found = 1;
last;
}
}
if ($found == 0) {
my %vsaaa = ("type" => "file");
my %st1 = ($x => \%vsaaa );
push(@{$last_files_block}, \%st1);
}
last;
}
my $found = 0;
foreach my $v (@{$last_files_block}) {
if (defined($v->{$x})) {
$last_files_block = $v->{$x}{"files"};
$counter += 1;
$found = 1;
last;
}
}
if ($found == 0) {
my %vsaaa = ("type" => "dir", "files" => [] );
my %st1 = ($x => \%vsaaa );
push(@{$last_files_block}, \%st1);
$last_files_block = $vsaaa{"files"};
$counter += 1;
}
}
} elsif (-d $f) {
print($f, " is a dir\n");
my @a = split("/",$f);
my $result;
my $counter = 0;
my $last_files_block = $st{"/"}{"files"};
while (1) {
unless ($counter < scalar(@a)) {
last;
}
my $x = $a[$counter];
if ($x eq '') {
$counter += 1;
next;
}
my $found = 0;
my $found_link = 0;
foreach my $v (@{$last_files_block}) {
if (defined($v->{$x})) {
if ($v->{$x}{"type"} eq "dir-link" || $v->{$x}{"type"}
+ eq "link-file") {
$found_link = 1;
last;
}
$last_files_block = $v->{$x}{"files"};
$counter += 1;
$found = 1;
last;
}
}
if ($found_link == 1) {
last;
}
if ($found == 0) {
my %vsaaa = ("type" => "dir", "files" => [] );
my %st1 = ($x => \%vsaaa );
push(@{$last_files_block}, \%st1);
$last_files_block = $vsaaa{"files"};
$counter += 1;
}
}
} else {
#TODO: When can it happen other than path does not exist o
+r permission denied?
print($f, " is a special\n");
}
}
| [reply] [d/l] [select] |
I think the significant bit of information that was missing previously (the "X" in the XY Problem) is what you mentioned here: "I'm trying to create a Singularity recipes builder." By this I'm guessing you mean Singularity, and their "Recipes" to build containers, more specifically, something you can execute in their Singularity file %post section (which gets executed with /bin/sh) to build the container?
By "recipes builder", do you mean you want to write a Perl script that will generate commands that can be executed by /bin/sh to reproduce a certain environment (directory structure, links, etc.)? In other words, you want to write a Perl script that will generate a sequence of mkdir -p commands, followed by cp commands, followed by ln -s commands, such that when Singularity builds the container and executes the script containing these commands, those dirs/links/files will be present in the generated squashfs image?
(By the way, why not use the built-in %files section?)
Note that I had to deduce all this means you need to describe your task better :-) Remember to explain the "X" you're trying to accomplish, plus sample input, expected output for that input - something like a high-level SSCCE.
You haven't shown your input, which I am guessing is the filesystem that you want to mirror into the container? One way you could provide an SSCCE for us is to give us a list of commands to recreate the directory structure.
You also haven't shown your expected output, i.e. the /bin/sh script you want to produce.
Interesting: Note that both input and output are basically the same thing!
So if I'm correct with all my guesses so far, the problem can be more or less reduced to: a Perl script that will basically round-trip a /bin/sh script containing mkdir, cp, and ln commands.
However, since that's a lot of guessing, I'm going to stop here for now - please let us know if the above is correct or not, and if not, what it is you're actually trying to do. (Also, looking over choroba's sample code, it looks like a good starting point.)
| [reply] [d/l] [select] |
Can someone suggest on strategy on how to solve it? I tried some other similar things but it got too complicated and failed.
| [reply] |
You might try using a hash of hash keyed by a directory where the value is a hash of all the sub-directories or undef if this is a terminal entry. That makes editing links as easy as copying the value to a new key and deleting the old key.
Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
| [reply] |
#!/usr/bin/perl
use strict; # https://perlmonks.org/?node_id=11130527
use warnings;
local $_ = <<END;
/usr/vsa/pkgs/python3/3.6.3a/bin/python3.6
/usr/vsa -> /root/site/tools/gauv
/usr/vsa/pkgs/python3/3.6.3a -> 3.6.3
END
s!^(\S+/)([^\s/]+) -> \K([^/]\S*)!$1$3!gm; # fixup relative
my $was = '';
while( $was ne $_ )
{
my %links;
$links{$1} = $2 while /^(\S+) -> (\S+)/gm;
$was = $_;
for my $link ( sort keys %links )
{
s~(?<!\S)\Q$link\E(?=/)~$links{$link}~; # replace links
}
}
my %subpaths;
m~^(\S+)(?=[\s/])(?{$subpaths{$1}++})(*FAIL)~m; # get all subpaths
my @subpaths = sort { length $a <=> length $b } sort keys %subpaths;
print "$_\n" for @subpaths;
Outputs:
/usr
/root
/usr/vsa
/root/site
/root/site/tools
/root/site/tools/gauv
/root/site/tools/gauv/pkgs
/root/site/tools/gauv/pkgs/python3
/root/site/tools/gauv/pkgs/python3/3.6.3
/root/site/tools/gauv/pkgs/python3/3.6.3a
/root/site/tools/gauv/pkgs/python3/3.6.3/bin
/root/site/tools/gauv/pkgs/python3/3.6.3/bin/python3.6
| [reply] [d/l] [select] |