in reply to Normalized directory paths

Here's a complete program, demonstrating the correct solution given above:
#!/usr/bin/perl -w
use strict;

my $path = "/foo/mik/../mik/./../mik";
# print "Path is --$path--\n";
$path =~ s!/\.([^.])!$1!g;
# print "Path is now --$path--\n";
while ($path =~ m!/\.!) {
    $path =~ s!/[^\/]+/\.\.!!;
    # print "Path is now -+$path+-\n";
}
print "Ended up with =>>$path<<=\n";
First, we get rid of /., because that won't take us anywhere. Next, we loop while the path still contains a forward slash followed by a dot. As /.. takes us up a directory, we look for a directory name followed by that construct, and get rid of the whole thing. Eventually, the loop has to fail, and we must have come up with something.

merlyn and turnstep are correct, though -- doing this sort of thing with regular expressions is pretty easy to break. Unless you have *complete* control of the directories being passed to your script and in the filesystem, don't use this.
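
For comparison, here is a minimal sketch of a non-regex approach: split the path into components and walk them with a stack. This is not the method used in the program above. The name normalize_path is hypothetical, and the sketch assumes purely lexical normalization: symlinks are not resolved and the filesystem is never consulted, so the result can differ from what the OS would resolve.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: lexically normalize a path, component by component.
# Assumption: no symlink resolution; the filesystem is never touched.
sub normalize_path {
    my ($path) = @_;
    my $absolute = $path =~ m{^/} ? 1 : 0;
    my @stack;
    for my $part (split m{/+}, $path) {
        next if $part eq '' || $part eq '.';    # skip empty and "." components
        if ($part eq '..') {
            if (@stack && $stack[-1] ne '..') {
                pop @stack;                     # ".." cancels the previous name
            }
            elsif (!$absolute) {
                push @stack, '..';              # keep leading ".." on relative paths
            }
            next;                               # ".." at the root of an absolute path is dropped
        }
        push @stack, $part;                     # ordinary name, including ".hidden" and "..."
    }
    my $result = join '/', @stack;
    return "/$result" if $absolute;
    return $result eq '' ? '.' : $result;
}

print normalize_path("/foo/mik/../mik/./../mik"), "\n";   # prints /foo/mik
```

Because only the exact components "." and ".." are treated specially, names such as ".hidden" or "..." pass through untouched.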

RE: Re: Normalized directory paths
by mikfire (Deacon) on May 17, 2000 at 00:15 UTC
    #!/usr/bin/perl -w
    use strict;

    my $path = "/foo/mik/../mik/./../mik";
    #print "Path is --$path--\n";
    $path =~ s!/\.([^.])!$1!g;
    #print "Path is now --$path--\n";
    while ($path =~ m!/\.!) {
        $path =~ s!/[^\/]+/\.\.!!;
        # print "Path is now -+$path+-\n";
    }
    Forgive me, but this breaks when $path = "/foo/mik/.hidden". The first regex results in
    Path is now --/foo/mikhidden--

    Mik

Re^2: Normalized directory paths
by znik (Initiate) on Feb 16, 2016 at 08:02 UTC

    The hardest case is replacing /somedir/../. Unfortunately, the pattern [^/]+/../ does not work as it should. Imagine the source path is /../../: the pattern matches, because [^/]+ also accepts ".." as a directory name. To fully clean and normalize an input path, I suggest this procedure:

    $path='./anything/../.../something'; #example
    $path=~s!/+!/!g; #replace //// by single /
    $path=~s!^\./!!; #remove starting ./
    $path=~s!/\.(?=$|/)!!g; #remove all /.
    1 while $path =~ s!/([^/]{3,}|[^/.][^/]*|\.[^/.])/\.\.(?=/|$)!!g; #remove /something/..
    $path=~s!^([^/]{3,}|[^/.][^/]*|\.[^/.])/\.\./!!; #remove starting something/../
    $path='.' if $path eq ''; #point to the current directory if the path ends up empty
    #at this point $path is normalized

    You can wrap this code in a function :) To clarify what the sequence ([^/]{3,}|[^/.][^/]*|\.[^/.]) is for: it is something special. It matches every name that does not contain a '/' character, except that it does not match '..'. Testing for a single '.' is unnecessary because those have already been removed. Note that this fragment does match '...' and longer runs of dots, because only the single and double dot are reserved. It also allows file names like '.something', commonly used as hidden names on Unix-like systems.
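
    A sketch of what such a function might look like. The name normalize is hypothetical; the substitutions are the ones from the procedure above, with the repeated alternation factored into a qr// variable. The sample inputs come from elsewhere in this thread.

    ```perl
    use strict;
    use warnings;

    # Hypothetical wrapper around the substitutions listed above.
    sub normalize {
        my ($path) = @_;
        # Any single path component except "." and ".."
        my $name = qr{[^/]{3,}|[^/.][^/]*|\.[^/.]};
        $path =~ s!/+!/!g;                               # collapse runs of slashes
        $path =~ s!^\./!!;                               # remove starting ./
        $path =~ s!/\.(?=$|/)!!g;                        # remove every /. component
        1 while $path =~ s!/(?:$name)/\.\.(?=/|$)!!g;    # remove /something/..
        $path =~ s!^(?:$name)/\.\./!!;                   # remove starting something/../
        $path = '.' if $path eq '';                      # empty result means current directory
        return $path;
    }

    print normalize('./anything/../.../something'), "\n";   # prints .../something
    ```

    With this wrapper, "/foo/mik/.hidden" comes back unchanged and "/.../up/.../down/../allaround" becomes "/.../up/.../allaround".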

    Thank you for the congratulations; I am happy if this piece of code helps someone :) I know, I reinvented the wheel :)

RE: Re: Normalized directory paths
by turnstep (Parson) on May 17, 2000 at 00:51 UTC
    This enters an infinite loop if you have something like this:
    $path = "/.../up/.../down/../allaround";
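
    The loop never ends here because its condition (m!/\.!) stays true for "/..." even once the substitution inside has nothing left to match. One possible guard, shown only as a sketch, is to loop on the substitution itself and to require that the ".." is a whole component. The name collapse_dotdot is hypothetical, and this does not cure the other problems with the regex approach: [^/]+ still accepts ".." as a directory name.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: strip "/name/.." pairs until nothing changes.
    sub collapse_dotdot {
        my ($path) = @_;
        # The loop ends as soon as the substitution fails to match, and the
        # lookahead keeps "/up/..." from being read as "/up/.." plus a dot.
        1 while $path =~ s!/[^/]+/\.\.(?=/|$)!!;
        return $path;
    }

    print collapse_dotdot("/.../up/.../down/../allaround"), "\n";   # prints /.../up/.../allaround
    ```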