How to get the unique canonical path of a given path?

ovedpo15 has asked for the wisdom of the Perl Monks concerning the following question:

Re: How to get the unique canonical path of a given path? by hippo (Archbishop) on Jul 13, 2022 at 15:39 UTC
Does Path::Tiny::realpath do what you want? `$ perl -MPath::Tiny -E 'say path(q#../script.sh#)->realpath'` prints `/home/script.sh`. 🦛
Re^2: How to get the unique canonical path of a given path? by ovedpo15 (Pilgrim) on Jul 13, 2022 at 18:31 UTC
Hi, thanks for the suggestion, but not quite. I don't want to resolve any links in the path. I thought of adding: `unless ($path =~ /^\//) { $path = catdir(getcwd,$path); }` What do you think?
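A minimal sketch of that idea, assuming a Unix-like system and core modules only: `File::Spec->rel2abs` prepends a base directory to a relative path and leaves absolute paths alone, without consulting the filesystem, so symlinks and `..` stay unresolved. The helper name `to_absolute` is made up for illustration.

```perl
use strict;
use warnings;
use Cwd qw( getcwd );
use File::Spec;

# Hypothetical helper: make a path absolute without resolving symlinks.
sub to_absolute {
    my ($path) = @_;
    # rel2abs returns absolute paths as they are (after light cleanup)
    # and prepends the given base directory to relative ones.
    return File::Spec->rel2abs ($path, getcwd ());
}

print to_absolute ("../script.sh"), "\n";   # current directory + "/../script.sh"
print to_absolute ("/etc/passwd"), "\n";    # already absolute, left alone
```

Note that the `..` is still present in the result; this step only anchors the path, it does not collapse anything.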
Re^3: How to get the unique canonical path of a given path? by Fletch (Bishop) on Jul 13, 2022 at 21:58 UTC
I don't think you're going to find much off the shelf, because the POSIX/*nix notion of a "canonical" path is one where symlinks are expanded and replaced (see Pathname Resolution). I think the closest you might get would be rolling something using File::Spec and `splitdir` to unroll the path (maybe with glob for tilde handling beforehand), process the dot(s), then concatenate the remaining elements back together.
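The split, process, and rejoin approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: Unix-style `/` separators, no tilde handling, and purely lexical processing, so a symlink in front of a `..` is ignored. The name `lexical_clean` is hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: collapse ".", "//" and "name/.." purely lexically.
# The filesystem is never consulted, so symlinks are NOT taken into account.
sub lexical_clean {
    my ($path) = @_;
    my $is_abs = File::Spec->file_name_is_absolute ($path);
    my @out;
    for my $part (File::Spec->splitdir ($path)) {
        # Empty parts come from the leading "/" and from doubled slashes.
        next if $part eq "" || $part eq ".";
        if ($part eq "..") {
            # Cancel the previous real name, if there is one.
            if (@out && $out[-1] ne "..") { pop @out; next; }
            # At the root, ".." stays at the root.
            next if $is_abs;
        }
        push @out, $part;   # ordinary name, or a leading ".." of a relative path
    }
    my $clean = join "/", @out;
    return "/$clean" if $is_abs;
    return length $clean ? $clean : ".";
}

print lexical_clean ($_), "\n" for qw(
    /a/b/c/d/../../../e
    /a/../b/./c//d
    ../script.sh
    );
# /a/e
# /b/c/d
# ../script.sh
```

A leading `..` on a relative path is kept as it is, since there is nothing to cancel it against without knowing the current directory.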
Re^3: How to get the unique canonical path of a given path? by hippo (Archbishop) on Jul 13, 2022 at 21:26 UTC
"I don't want to resolve any links in the path."

Well then, I don't understand what you mean, because AFAICS `realpath` does not do that. If you can provide an SSCCE (ideally in the form of a test) showing precisely what output you want for a given input and how `realpath` fails in that regard, perhaps we can guide you further.

Update: yes, it does do that - see replies. 🦛
Re^4: How to get the unique canonical path of a given path? by ikegami (Patriarch) on Jul 14, 2022 at 14:50 UTC
Re^5: How to get the unique canonical path of a given path? by hippo (Archbishop) on Jul 14, 2022 at 15:27 UTC
Re: How to get the unique canonical path of a given path? by Tux (Canon) on Jul 15, 2022 at 13:43 UTC
Resolving `/../` may lead to illegal or unwanted locations. The part before the `/../` might be a symbolic link pointing to somewhere completely outside of the path you are changing.

Your solution, as mentioned by others, is not safe for a leading `./` or `../`, or for a run of `../../..` that would go beyond the root.

With your code safeguarded against the above remarks, here is a comparison of the methods mentioned in this thread plus the one I would use: Cwd::abs_path. Note that `abs_path` returns `undef` for a non-existing path.

```perl
#!/usr/bin/perl

use 5.018003;
use warnings;

use Cwd qw( abs_path );
use Path::Tiny;
use File::Spec;

my @pth = qw(
    /a/b/c/d/../../../e
    /a/../b/./c//d
    ../scripting
    ./tmp
    /tmp/../../../tmp
    );

sub resolves {
    my ($p, $r) = @_;
    printf "%-20s -> %s\n", $p, $r // "$p does not resolve";
    } # resolves

say "OP";
foreach my $pth (@pth) {
    my @c = reverse split m{/+}, $pth; # /+ removes empty elements
    my @c_new;
    while (@c) {
        my $component = shift @c;
        next unless length ($component);
        $component eq "." and @c and next;
        if ($component eq ".." and @c) {
            my $i = 0;
            while ($i <= $#c && $c[$i] =~ m/^\.{0,2}$/) {
                $i++;
                }
            splice @c, $i, 1;
            next;
            }
        push @c_new => $component;
        }
    @c = reverse @c_new;
    $c[0] =~ m/^\.\.?$/ or unshift @c => "";
    resolves $pth, join "/" => @c;
    }

say "Cwd::abs_path";
foreach my $pth (@pth) {
    resolves $pth, abs_path ($pth);
    }

say "Path::Tiny::path";
foreach my $pth (@pth) {
    resolves $pth, Path::Tiny::path ($pth);
    }

say "File::Spec::canonpath";
foreach my $pth (@pth) {
    resolves $pth, File::Spec->canonpath ($pth);
    }
```

which produces

```
OP
/a/b/c/d/../../../e  -> /a/e
/a/../b/./c//d       -> /b/c/d
../scripting         -> ../scripting
./tmp                -> ./tmp
/tmp/../../../tmp    -> /tmp
Cwd::abs_path
/a/b/c/d/../../../e  -> /a/b/c/d/../../../e does not resolve
/a/../b/./c//d       -> /a/../b/./c//d does not resolve
../scripting         -> /home/scripting
./tmp                -> /home/merijn/tmp
/tmp/../../../tmp    -> /tmp
Path::Tiny::path
/a/b/c/d/../../../e  -> /a/b/c/d/../../../e
/a/../b/./c//d       -> /a/../b/c/d
../scripting         -> ../scripting
./tmp                -> tmp
/tmp/../../../tmp    -> /tmp/../../../tmp
File::Spec::canonpath
/a/b/c/d/../../../e  -> /a/b/c/d/../../../e
/a/../b/./c//d       -> /a/../b/c/d
../scripting         -> ../scripting
./tmp                -> tmp
/tmp/../../../tmp    -> /tmp/../../../tmp
```

Enjoy, Have FUN! H.Merijn
Re: How to get the unique canonical path of a given path? by rizzo (Curate) on Jul 13, 2022 at 23:05 UTC
From the question at Stack Overflow you cited:

"Normalize a pathname by collapsing redundant separators and up-level references so that A//B, A/B/, A/./B and A/foo/../B all become A/B."

Given this task, it makes no sense to start your path with a double dot, because the directory above your first one, the one you want to change to, is not known. The same goes for /a/../../b, etc.
Re: How to get the unique canonical path of a given path? by ikegami (Patriarch) on Jul 14, 2022 at 15:00 UTC
"It works good for most cases but I had trouble with paths that start, for example, with ".." (like "../script.sh")"

What output do you want for `../script.sh`?
Re: How to get the unique canonical path of a given path? by tybalt89 (Monsignor) on Jul 16, 2022 at 14:28 UTC
Are these answers correct?

```perl
#!/usr/bin/perl

use strict; # https://perlmonks.org/?node_id=11145493
use warnings;

for my $path ( qw(
    /a/b/c/d/../../../e
    /a/../b/./c//d
    ../invalid_do_not_change
    ./tmp
    /tmp/../../../tmp
    A//B
    A/B/
    A/./B
    A/foo/../B
    ) )
  {
  local $_ = $path;
  1 while s{
    /+(?=/) |                    # multiple ///
    ^/\.\.(?=/) |                # stay at root
    /\z |                        # remove trailing /
    (?<=/)(?!\.\./)[^/]+/\.\./ | # remove 'name/../'
    (?<![^/])\./                 # remove ./
    }{}x;
  printf "%30s -> %s\n", $path, $_;
  }
```

Outputs:

```
           /a/b/c/d/../../../e -> /a/e
                /a/../b/./c//d -> /b/c/d
      ../invalid_do_not_change -> ../invalid_do_not_change
                         ./tmp -> tmp
             /tmp/../../../tmp -> /tmp
                          A//B -> A/B
                          A/B/ -> A/B
                         A/./B -> A/B
                    A/foo/../B -> A/B
```
Re^2: How to get the unique canonical path of a given path? by pryrt (Abbot) on Jul 16, 2022 at 15:47 UTC
`/tmp/../../../tmp -> /tmp`

Unless I'm very much mistaken, wouldn't that try to go two directories above the root before trying to find a `tmp` subdirectory of the nonexistent location? In `/tmp/../../../tmp`, the first `/tmp/..` pair resolves to just `/`, so that is effectively `/../../tmp`, but that path is rather meaningless, because you cannot `cd ..` from `/`.

Never mind; ignore my objection from the spoiler. I just tried `cd /tmp/../../../tmp` and found that it did go to the `/tmp` directory. And `perl -le 'use autodie; open my $fh, ">", "/tmp/../../../tmp/worked.txt"; print {$fh} "it worked";'` works as expected as well, so apparently that weird notation is perfectly valid syntax. Sorry.
Re^3: How to get the unique canonical path of a given path? by tybalt89 (Monsignor) on Jul 16, 2022 at 17:43 UTC
It works because on a *nix system the root directory is pretty much defined by "`..` is the same as `.`".
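One way to check this for yourself is to compare the device and inode numbers of `/` and `/..`. This is a small sketch, assuming a Unix-like system where `stat` reports meaningful device and inode values:

```perl
use strict;
use warnings;

# Fields 0 and 1 of stat are the device number and the inode number;
# two paths with the same pair refer to the same directory.
my @root   = stat "/"   or die "stat /: $!";
my @parent = stat "/.." or die "stat /..: $!";

my $same = $root[0] == $parent[0] && $root[1] == $parent[1];
print $same ? "/.. is the same directory as /\n"
            : "/.. differs from /\n";
```

On a typical Linux system this should report that both are the same directory, which is why `/tmp/../../../tmp` ends up at `/tmp`.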
Re^2: How to get the unique canonical path of a given path? by ikegami (Patriarch) on Jul 17, 2022 at 18:58 UTC
[Nothing to see here]
Re: How to get the unique canonical path of a given path? by bliako (Abbot) on Jul 17, 2022 at 18:04 UTC
You got a lot of good answers. I will copy-paste an idea from shell scripting (which pryrt's answer reminded me of): change directory to the location and then ask for the cwd. It implies that you can ~~find~~ extract the `dirname` of said path (edit: just for cd'ing to it, so you don't need to resolve it, the system will (try)) and also (edit: most importantly) have the permissions to change directory to it.

Edit: Also, these paths must be real so that you can chdir to them.

Edit: so perhaps not very practical in some use cases. I use this to find the containing dir of a shell script in bash. But as I said, caveats exist. Oh! And it will be super slow compared to any programmatic way.

1 min edit: If you chdir to a symlinked dir, some systems' cwd will report the symlinked dir instead of resolving it, so ...

bw, bliako
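In Perl, the chdir-then-ask idea might look like the sketch below. It assumes a Unix-like system and an existing, accessible directory part; the helper name `abs_via_chdir` is made up for illustration. Note that `Cwd::getcwd` reports the physical directory, so symlinks in the directory part end up resolved, which may not be what the original question wants.

```perl
use strict;
use warnings;
use Cwd qw( getcwd );
use File::Basename qw( dirname basename );
use File::Temp qw( tempdir );

# Hypothetical helper: let the OS resolve the directory part of a path.
sub abs_via_chdir {
    my ($path) = @_;
    my $orig = getcwd ();
    # Fails (returns nothing) if the directory is missing or not accessible.
    chdir dirname ($path) or return;
    my $dir = getcwd ();
    chdir $orig or die "cannot return to $orig: $!";
    $dir =~ s{/\z}{};                      # avoid "//" when the directory is "/"
    return "$dir/" . basename ($path);
}

# Demonstration in a scratch directory, so no particular layout is assumed.
my $tmp = tempdir (CLEANUP => 1);
mkdir "$tmp/a" or die "mkdir: $!";
print abs_via_chdir ("$tmp/a/../a/./file.txt"), "\n";
```

The file itself does not have to exist, only its directory, and the process returns to its original working directory afterwards.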
Re: How to get the unique canonical path of a given path? by perlfan (Parson) on Jul 17, 2022 at 14:43 UTC
`readlink -f` is the shell command (from GNU coreutils) to resolve this. This may help you discover the answer for Perl. Perl has readlink, but YMMV.
Re: How to get the unique canonical path of a given path? by Anonymous Monk on Jul 14, 2022 at 15:05 UTC
See also `File::Spec->canonpath()`. If you like what it does, fine. If not, the documentation lists some of the pitfalls involved in this operation. `File::Spec` is a core module.
Re^2: How to get the unique canonical path of a given path? by ikegami (Patriarch) on Jul 14, 2022 at 15:53 UTC
It doesn't. They don't remove '..' because it might change the meaning of the path in case of symlinks.