polettix has asked for the wisdom of the Perl Monks concerning the following question:

Venerables,

I'm trying to get rid of references to parent directory in a path, but it seems that File::Spec doesn't help me much:

#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

my $somepath = "/var/log/../../home/poletti/../../etc/passwd";
print "starting path: [$somepath]\n";
print "wanted path: [/etc/passwd]\n";
print "canonpath: [", File::Spec->canonpath($somepath), "]\n";
print "rel2abs: [", File::Spec->rel2abs($somepath), "]\n";
print "abs2rel: [", File::Spec->rel2abs($somepath, "/"), "]\n";
my @portions = File::Spec->splitpath($somepath);
print "splitpath + catpath: [", File::Spec->catpath(@portions), "]\n";

__END__
__output__
starting path: [/var/log/../../home/poletti/../../etc/passwd]
wanted path: [/etc/passwd]
canonpath: [/var/log/../../home/poletti/../../etc/passwd]
rel2abs: [/var/log/../../home/poletti/../../etc/passwd]
abs2rel: [/var/log/../../home/poletti/../../etc/passwd]
splitpath + catpath: [/var/log/../../home/poletti/../../etc/passwd]

Is there any module that addresses this problem? TIA,

Flavio (perl -e "print(scalar(reverse('ti.xittelop@oivalf')))")

Don't fool yourself.

Re: Cleaning up a path
by ikegami (Patriarch) on Apr 13, 2005 at 17:29 UTC

    The problem is that (in *ix) the long path "/var/log/../../home/poletti/../../etc/passwd" does not necessarily compact to "/etc/passwd". For example, if "/home/poletti" was a symbolic link to "/drive2/home/poletti", the long path would compact to "/drive2/etc/passwd". That's why canonpath does not eliminate "segment/../" in *ix.

    Update: Here's some code that does the trick, ignoring symbolic links:

    use File::Spec::Unix ();

    sub remove_dot_dot {
        local $_ = $_[0];
        $_ = File::Spec::Unix->canonpath($_);
        1 while s#[^/]+/\.\./##g;
        s#/[^/]+/\.\.$#/#g;
        s#^[^/]+/\.\.$#.#g;
        return $_;
    }

    Tests:

    Update2: I got confused and thought ../foo/bar should give bar. Fixed.

Re: Cleaning up a path
by tlm (Prior) on Apr 13, 2005 at 17:26 UTC

    The following solution is not perfect, but works for me most of the time (the version below assumes that $path points to a subdir, but it can easily be adapted to the general case in which $path can also point to a file):

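    use Cwd;

    sub cleanpath {
        my $path = shift;    # the directory to clean up
        my $cwd = cwd;       # remember where we started
        chdir $path or die "Can't chdir to $path: $!\n";
        my $cleanpath = cwd; # the cleaned-up path, as the OS reports it
        chdir $cwd or die "Can't return to $cwd: $!\n";
        return $cleanpath;
    }
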
    The principal shortcoming of this approach is that it only applies to directories that your program can visit; this excludes non-existent directories, unfortunately. (Hence, a more useful implementation, unlike the sketch above, would not simply die if it fails to visit the input path.)
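
A minimal sketch of the "more useful implementation" hinted at above, assuming a Unix-like system. The name cleanpath_or_undef is hypothetical, not from the thread: it returns undef instead of dying when the directory cannot be visited, and it also accepts a path to a file by cleaning the containing directory and re-attaching the file name.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Basename qw(fileparse);

# Hypothetical variant of the chdir trick (the name is an assumption).
# Returns the cleaned path, or undef if the directory cannot be visited.
sub cleanpath_or_undef {
    my ($path) = @_;

    # If $path is not a directory, clean its parent and keep the last part.
    my ($file, $dir) = ('', $path);
    ($file, $dir) = fileparse($path) unless -d $path;

    my $start = getcwd();
    return unless defined $start;
    chdir $dir or return;           # unreachable directory: give up quietly
    my $clean = getcwd();
    chdir $start or die "Can't return to $start: $!\n";

    return unless defined $clean;
    return $clean unless length $file;
    return $clean eq '/' ? "/$file" : "$clean/$file";
}
```

Failing to return to the starting directory still dies, because silently leaving the program somewhere else seems worse than stopping. Like the original, this follows symbolic links, since the operating system resolves them during chdir.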

    Basically, in Unix, symbolic links render the path notation ambiguous; in the simplest terms, if foo is a symbolic link to some (non-symbolic) directory bar, then the formal expression foo/.. is ambiguous, because it can be interpreted as either the parent of foo or the parent of bar. (In fact, even foo/. is ambiguous.) The definition of "cleaning up a path" hinges on how you want to resolve this ambiguity. There are uses for both forms of clean-up.
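
The ambiguity can be reproduced with a throwaway directory tree. This is only an illustration, assuming a filesystem that supports symlinks; the directory names (real, bar, foo) are made up for the demo.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Temp qw(tempdir);

# Throwaway tree (all names are illustrative):
#   $root/real/bar   a real directory
#   $root/foo        a symlink pointing at $root/real/bar
my $root = abs_path(tempdir(CLEANUP => 1));
mkdir "$root/real"     or die "mkdir: $!";
mkdir "$root/real/bar" or die "mkdir: $!";
symlink "$root/real/bar", "$root/foo" or die "symlink: $!";

# Textual reading: drop "foo/.." as a pair, giving the parent of foo.
(my $textual = "$root/foo/..") =~ s{/[^/]+/\.\.\z}{};

# Filesystem reading: follow the link first, giving the parent of bar.
my $physical = abs_path("$root/foo/..");

print "textual : $textual\n";    # $root
print "physical: $physical\n";   # $root/real
```

The two answers differ, which is the whole point: a string-only clean-up and a filesystem-based one are answering different questions.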

    the lowliest monk

Re: Cleaning up a path
by Roy Johnson (Monsignor) on Apr 13, 2005 at 19:54 UTC
    I can't believe there isn't a solution out there that does this. But since I didn't find one, here's mine. It handles the symlink issue. Rather minimally tested.

    Caution: Contents may have been coded under pressure.
Re: Cleaning up a path
by derby (Abbot) on Apr 13, 2005 at 17:38 UTC

    Well, if you're not worried about portability:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $somepath = "/var/log/../../home/poletti/../../etc/passwd";
    my( @true );
    foreach( split( /\//, $somepath ) ) {
        $_ eq ".." ? pop( @true ) : push( @true, $_ );
    }
    print join( '/', @true ), "\n";

    Let the golf begin

    -derby
      1 while s#[^/]+/\.\.(/|$)##
      ikegami's remarks still apply.

      Caution: Contents may have been coded under pressure.

        aye, works great (if you canonise first)

        use File::Spec::Unix ();

        sub remove_dot_dot_roy_edited {
            local $_ = $_[0];
            $_ = File::Spec::Unix->canonpath($_);
            # dots escaped so that only a literal ".." segment is matched
            1 while s#[^/]+/\.\.(/|$)##;
            return length($_) ? $_ : '.';
        }

      While that works for

      /foo/../bar                                  -> /bar
      foo/../bar                                   -> bar
      /foo/bar/../../moo                           -> /moo
      /var/log/../../home/poletti/../../etc/passwd -> /etc/passwd

      it doesn't work for

      /foo/..        ->          <- Should be /
      foo/..         ->          <- Should be . (?)
      ..             ->          <- Should be ..
      ../foo         -> foo      <- Should be ../foo
      ../foo/bar     -> foo/bar  <- Should be ../foo/bar
      foo/bar/../..  ->          <- Should be . (?)
      /foo/bar/../.. ->          <- Should be /

      This breaks in the case depicted by ikegami, but can be adjusted to cope with the portability problem:
      #!/usr/bin/perl
      use strict;
      use warnings;
      use File::Spec;

      my $somepath = "/var/log/../../home/poletti/../../etc/rc.d";
      my @true;
      $somepath = "/../../../";

      # Let's pretend $somepath is a directory... or is it? :)
      /^\.\./ ? pop( @true ) : push( @true, $_ )
        foreach( File::Spec->splitdir($somepath) );
      print File::Spec->catdir(@true), "\n";

      Flavio (perl -e "print(scalar(reverse('ti.xittelop@oivalf')))")

      Don't fool yourself.
        /foo/..             -> /            <- Fixed
        /foo/../bar         -> /bar         <- Still works
        foo/..              ->              <- Should be . (?)
        foo/../bar          -> bar          <- Still works
        ..                  ->              <- Should be ..
        ../foo              -> foo          <- Should be ../foo
        ../foo/bar          -> foo/bar      <- Should be ../foo/bar
        foo/bar/../..       ->              <- Should be . (?)
        /foo/bar/../..      -> /            <- Fixed
        /foo/bar/../../moo  -> /moo         <- Still works
        {...}/etc/passwd    -> /etc/passwd  <- Still works

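
For reference, here is one way the remaining "Should be" cases could be handled with a purely textual pass. This is a sketch, not code from any of the posters: the name collapse_dot_dot is hypothetical, it assumes Unix-style "/" separators, and it deliberately ignores symbolic links, so ikegami's caveat still applies.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption): collapse "." and ".."
# on the string alone. The filesystem is never consulted, so symlinks
# are NOT honoured.
sub collapse_dot_dot {
    my ($path) = @_;
    my $is_abs = $path =~ m{^/};
    my @out;
    for my $seg (split m{/+}, $path) {
        next if $seg eq '' || $seg eq '.';
        if ($seg eq '..') {
            if (@out && $out[-1] ne '..') {
                pop @out;            # cancel the previous real segment
            }
            elsif (!$is_abs) {
                push @out, '..';     # keep leading ".." on relative paths
            }
            # on an absolute path, ".." at the root stays at the root
        }
        else {
            push @out, $seg;
        }
    }
    my $joined = join '/', @out;
    return "/$joined" if $is_abs;
    return length $joined ? $joined : '.';
}

print collapse_dot_dot($_), "\n"
    for '/foo/..', 'foo/..', '..', '../foo', 'foo/bar/../..';
# prints: /  .  ..  ../foo  .   (one per line)
```

Treating ".." at the root of an absolute path as the root itself matches what Unix kernels do; whether "foo/.." should become "." or the empty string is a matter of taste, as the question marks in the tables above suggest.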
Re: Cleaning up a path
by polettix (Vicar) on Nov 18, 2006 at 12:20 UTC
    While a little out of the target of my OP, I've found that Cwd actually has something that really cleans up a path dealing with all the symlink issues:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Cwd 'abs_path';

    my $somepath = "/var/log/../../home/poletti/../../etc/passwd";
    print "starting path: [$somepath]\n";
    print "abs_path : [", abs_path($somepath), "]\n"

    __END__
    starting path: [/var/log/../../home/poletti/../../etc/passwd]
    abs_path : [/etc/passwd]

    According to the docs:
    abs_path
        my $abs_path = abs_path($file);

      Uses the same algorithm as getcwd(). Symbolic links and relative-path components ("." and "..") are resolved to return the canonical pathname, just like realpath(3).

    This probably requires that the file actually lives in the filesystem, but most of the time it's what one wants. Thank you all for the contributions, anyway :)

    Flavio
    perl -ple'$_=reverse' <<<ti.xittelop@oivalf

    Don't fool yourself.