perlhuhn has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I am writing a program that needs to open directories with the O_NOFOLLOW flag. Since opendir() does not support that, I use sysopen():

  sysopen my $dirh, $pathname, O_RDONLY|O_DIRECTORY|O_NOFOLLOW;
Unfortunately, readdir() does not accept the handle from sysopen:
  readdir $dirh;

readdir() attempted on invalid dirhandle $dirh
Is there a way to "convert" the handle from sysopen to one accepted by readdir? I have a work-around with chdir() but that's pretty ugly (error handling not included here):
  my $cwd = getcwd();
  sysopen my $dirh, $pathname, O_RDONLY|O_DIRECTORY|O_NOFOLLOW;
  chdir $dirh;
  opendir my $dirh2, '.';
  my @files = readdir $dirh2;
  closedir $dirh2;
  chdir $cwd;

Replies are listed 'Best First'.
Re: readdir() on a sysopen() handle?
by afoken (Chancellor) on Aug 20, 2017 at 13:05 UTC

    Looking through several linux man pages, it looks like you normally should use opendir, readdir or scandir, and closedir from C. Those functions are specified by POSIX and are portable. But the glibc also offers fdopendir that converts a plain integer file descriptor to a DIR *. So in C, something like this should work:

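    /* UNTESTED! */
    #include <dirent.h>
    #include <fcntl.h>
    #include <unistd.h>

    DIR *opendir_nofollow(const char *pathname)
    {
        int fd = open(pathname, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
        if (fd == -1) {
            return NULL;
        }
        DIR *d = fdopendir(fd);
        if (d == NULL) {
            close(fd); /* don't leak the descriptor if fdopendir fails */
        }
        return d;
    }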

    Converting that to a Perl directory handle will very likely require a little bit of XS code. Perhaps Inline::C might be helpful. You definitely want to have a look at the perl sources, the part that implements the opendir function, to see how to correctly create a directory handle.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

      As you wrote this I was actually fiddling with exactly that :-) The relevant Perl source appears to be pp_open_dir in pp_sys.c, which uses the IoDIRP macro, which apparently accesses the DIR * xiou_dirp slot of struct xpvio, but I can't seem to find any more documentation on it.

      Disclaimer: I am not an XS expert, I can't guarantee that the following is entirely correct! I got some of this from the Inline::C::Cookbook, a bit of research in perlapi, and a bit of fiddling...

      myreaddir does all of the work of opening and reading the directory in C, returning a Perl list, while _xs_myfdopendir with the Perl wrapper myfdopendir attempts to be a custom opendir.

      use warnings;
      use strict;

      use Inline C => <<'END_OF_C';

      void myreaddir(SV* sv_dirn) {
          Inline_Stack_Vars;
          Inline_Stack_Reset;
          int fd = open( SvPVx(sv_dirn, PL_na),
              O_RDONLY|O_DIRECTORY|O_NOFOLLOW);
          if (fd<0) Inline_Stack_Return(0);
          DIR* dir = fdopendir(fd);
          if (dir==NULL) Inline_Stack_Return(0);
          struct dirent *dp;
          while ( (dp=readdir(dir)) != NULL )
              Inline_Stack_Push(sv_2mortal(
                  newSVpvf("%s", dp->d_name) ));
          if( closedir(dir)!=0 ) Inline_Stack_Return(0);
          Inline_Stack_Done;
      }

      int _xs_myfdopendir(SV* sv_dirn, SV* sv_hnd) {
          int fd = open( SvPVx(sv_dirn, PL_na),
              O_RDONLY|O_DIRECTORY|O_NOFOLLOW);
          if (fd<0) return 0;
          DIR* dir = fdopendir(fd);
          if (dir==NULL) return 0;
          IoDIRP(sv_2io(sv_hnd)) = dir;
          return 1;
      }

      END_OF_C

      use Symbol qw/geniosym/;
      use File::Spec;
      sub myfdopendir {
          return unless _xs_myfdopendir(
              $_[0]//File::Spec->curdir, my $dh=geniosym );
          return $dh;
      }

      use Data::Dump;
      my @x = myreaddir('/tmp') or die $!;
      dd @x;
      my $dh = myfdopendir('/tmp') or die $!;
      dd readdir $dh;
      closedir $dh or die $!;

      Update: A couple of Perl modules that use XS to read directories, in particular the first one's readdir_hashref looks like it could be modified fairly simply: ReadDir, IO-Dirent, PerlIO-Util

         or die $! is wrong in my @x = myreaddir('/tmp') or die $!; because an empty list doesn't necessarily denote an error.

Re: readdir() on a sysopen() handle?
by haukex (Archbishop) on Aug 20, 2017 at 11:16 UTC

    I am not an expert on the underlying C API, but I looked into this a bit out of curiosity, and I haven't yet been able to find any examples showing whether it is even possible to readdir(3) a directory opened with open(2) instead of opendir(3). On *NIX systems, the Perl API mirrors the C API closely, so if it's not possible in C, Perl isn't going to be able to do it either, at least not natively. Perhaps there are some modules that use XS and can access other APIs provided by the OS, like openat(2) and related functions.

    One reference I found was an older version of the DJGPP manual, which explicitly says (edited for brevity): "You can open directories using open, but there is limited support for POSIX file operations on directories. The principal reason for allowing open to open directories is to support changing directories using fchdir. If you wish to read the contents of a directory, use the opendir and readdir functions instead." This seems to be exactly what your "workaround" is doing. There's also the file chdir-safer.c from gnulib which appears to use the fchdir technique in the function chdir_no_follow.
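
For comparison, here is a minimal C sketch of that fchdir technique, assuming a Linux/POSIX.1-2008 system. The function name count_entries_via_fchdir is made up for illustration and is not taken from DJGPP, gnulib, or the original post. It remembers the starting directory as a descriptor rather than as a path from getcwd(), so the way back does not depend on path names either.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: count the entries of `pathname` other than "." and
 * "..", refusing to follow a symlink in the last path component.
 * Returns the count, or -1 on any error. */
int count_entries_via_fchdir(const char *pathname)
{
    /* Remember where we are as a descriptor, not as a path. */
    int saved = open(".", O_RDONLY | O_DIRECTORY);
    if (saved == -1)
        return -1;

    int fd = open(pathname, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
    if (fd == -1) {
        close(saved);
        return -1;
    }

    int count = -1;
    if (fchdir(fd) == 0) {
        DIR *d = opendir(".");  /* "." is now the directory behind fd */
        if (d != NULL) {
            struct dirent *e;
            count = 0;
            while ((e = readdir(d)) != NULL) {
                if (strcmp(e->d_name, ".") != 0 &&
                    strcmp(e->d_name, "..") != 0)
                    count++;
            }
            closedir(d);
        }
        /* Return to the original directory; report failure if we cannot. */
        if (fchdir(saved) != 0)
            count = -1;
    }
    close(fd);
    close(saved);
    return count;
}
```

Like the chdir() workaround in the question, this changes the working directory of the whole process for a moment, so it is not something to do in a multi-threaded program.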

      Thanks for the reference to the opendir manpage. It mentions fdopendir(3) which would do what I need but it doesn't seem to be supported by Perl.
Re: readdir() on a sysopen() handle?
by Laurent_R (Canon) on Aug 20, 2017 at 09:52 UTC
    Maybe you can use opendir and then filter out the symbolic links when reading the directory with readdir.
      Such a filter would have to use stat() to determine if an entry is a directory before opening it. The problem is that an entry might change from a directory to a symbolic link between the stat() and the open(). O_NOFOLLOW prevents such race conditions.
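
To make the difference concrete, here is a small C sketch, assuming Linux/POSIX.1-2008. Both function names are hypothetical. The first variant checks and then opens, which leaves a window between the two calls; the second lets the kernel do the check as part of the open itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Racy variant: the entry can be replaced by a symlink after lstat()
 * has looked at it and before open() resolves the name again. */
int open_dir_check_then_open(const char *path)
{
    struct stat st;
    if (lstat(path, &st) == -1 || !S_ISDIR(st.st_mode))
        return -1;
    /* <-- window: another process may swap `path` for a symlink here */
    return open(path, O_RDONLY | O_DIRECTORY);
}

/* Atomic variant: the open fails if the last path component is a
 * symlink, so there is no separate check that could go stale. */
int open_dir_nofollow(const char *path)
{
    return open(path, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
}
```

Note that O_NOFOLLOW only covers the final path component; symlinks earlier in the path are still followed. Which errno the failing open reports for a symlink (ELOOP or ENOTDIR) may differ between systems, so it seems safer to treat any failure as "refuse".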
        The problem is that an entry might change from a directory to a symbolic link between the stat() and the open().

        Wouldn't a second stat() after the open tell? Well duh, the underlying file could just switch back from symlink to directory between the open() and the second stat, e.g. something that emulates a directory via a maliciously loaded file system module doing sinister things. Just curious - what problem are you trying to solve?

        Correct me if I am wrong, but after getting a handle to something, even if the something is renamed, deleted, and symlinked back, it holds to the original structure being accessed:

        my $path = '/tmp/open';
        -d $path and die "remove $path first\n";
        mkdir $path;
        for (qw(foo bar quux)) {
            open my $fh, '>',"$path/$_";
        }
        mkdir "$path/baz";
        for (qw(blorf blorfldyick)) {
            open my $fh,'>', "$path/baz/$_";
        }
        opendir my $dh1, $path;
        while(readdir $dh1) {
            next if /^\.\.?$/;
            print "read(dh1): $path/$_\n";
            if (-d "$path/$_") {
                opendir my $dh2, "$path/$_" or die;
                # emulate external change directory to symlink
                rename "$path/$_","$path/fie";
                symlink "$path/fie", "$path/$_" or die;
                # end emulate
                if(-l "$path/$_") {
                    print "bogus change to $path/$_:\n";
                    print " $path/$_ points to ",readlink "$path/$_","\n";
                }
                while (my $e = readdir $dh2) {
                    next if $e =~ /^\.\.?$/;
                    print "read(dh2): $e\n";
                }
            }
        }
        __END__
        read(dh1): /tmp/open/foo
        read(dh1): /tmp/open/quux
        read(dh1): /tmp/open/baz
        bogus change to /tmp/open/baz:
         /tmp/open/baz points to /tmp/open/fie
        read(dh2): blorf
        read(dh2): blorfldyick
        read(dh1): /tmp/open/bar

        Side note which might resolve this XY Problem (if so): -d on a symlink returns true up to v5.25.10, so -d resolves symlinks, which it shouldn't do. IMHO this is a bug.

        Apropos race condition: I can't think of anything which would resolve that, other than a system call like openif() into which the expected type is passed as an argument.

        perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
        The problem is that an entry might change from a directory to a symbolic link between the stat() and the open(). O_NOFOLLOW prevents such race conditions.
        I fail to see why or how O_NOFOLLOW would prevent this from happening if you were thinking about doing a sysopen followed by a readdir or anything more or less equivalent with a different system call.

        Maybe you should explain more precisely what you're really trying to do.

Re: readdir() on a sysopen() handle?
by Marshall (Canon) on Aug 21, 2017 at 01:12 UTC
    I am confused as to what application behavior you are trying to prevent, and what exactly your application is.

    A file system is like a continually evolving biological organism. There can be incestuous liaisons between family groups (symlinks).

    The directory structure correlates "textual names" to "structures of bits" which are called "files".

    When you do something like a readdir(), you get an imperfect snapshot of the "family tree" of textual names. It is completely possible to get a filename from a readdir() which can't be opened because it doesn't exist anymore once you actually try to open that textual name because some other process has deleted that name in the meantime.

    Depending upon the O/S and the type of file, it is possible to read a directory (which produces textual names), open a file (which resolves to a binary filehandle (independent of the text name)), and continue to use that file while the textual name is deleted from the directory. That situation means that one or many programs continue to use the "file" although no new program can open it because its "textual name" no longer exists.

    If you get to a file and actually open that file via a symlink, that file is open for use, even if the symlink is deleted (textual representation is deleted).
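
The point that an open handle outlives its name can be tried out with a short C sketch like the following, assuming a POSIX system. The function name survives_unlink is invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: create `path`, delete its name while the file is
 * still open, then write and read back through the descriptor.
 * Returns 0 if the data was usable after the unlink, -1 otherwise. */
int survives_unlink(const char *path)
{
    const char msg[] = "still here";
    char buf[sizeof msg] = {0};
    int ok = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd == -1)
        return -1;

    if (unlink(path) == 0                    /* the textual name is gone */
        && access(path, F_OK) == -1          /* nobody can open it by name */
        && write(fd, msg, sizeof msg) == (ssize_t)sizeof msg
        && lseek(fd, 0, SEEK_SET) == 0
        && read(fd, buf, sizeof buf) == (ssize_t)sizeof buf
        && memcmp(buf, msg, sizeof msg) == 0)
        ok = 0;                              /* the file itself lived on */

    close(fd);  /* only now can the storage actually be released */
    return ok;
}
```

The same holds for directory handles, which is why a handle obtained before a rename keeps referring to the original directory.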

    I like the first post by Laurent_R. If you don't want to follow a directory symlink, don't open it if it is one. I guess you can check if that directory name is still not a symlink once you open it, but all sorts of strange things can still happen.

    It would be helpful if you explained a bit more about what your application does and how it handles failed directory or file "opens".

      The purpose of the program is to write data files into a specific subdirectory of the users' home directories, e.g. /home/username/datadir/datafile.timestamp.txt.

      datadir is only writable by the program and readable by the user. But since it's inside the user's home directory, the user could rename it and replace it with a symlink, or re-create it and put a symlink with the datafile name inside.

      Of course, the obvious solution is to change the filesystem layout but that is currently not an option. So the program needs to open the directory and the data file with O_NOFOLLOW to avoid writing to the wrong places.

      The desired behavior when encountering a symlink is to refuse writing the data and produce a warning message. This case is rare enough that it's not too much hassle.
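
At the C level, that behavior could look roughly like the sketch below, assuming Linux/POSIX.1-2008. This is not the actual program: the function name write_datafile_nofollow, the file mode, and the wording of the warnings are all made up for illustration. It uses openat(2), mentioned earlier in the thread, so that the data file is resolved relative to the directory that was actually opened.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `data` to `filename` inside `dirpath`.
 * Assumes `filename` is a plain name without slashes.
 * Returns 0 on success; on any failure (including a symlink in place of
 * the directory or the file) it prints a warning and returns -1. */
int write_datafile_nofollow(const char *dirpath, const char *filename,
                            const char *data)
{
    int dfd = open(dirpath, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
    if (dfd == -1) {
        fprintf(stderr, "not writing: cannot open directory %s: %s\n",
                dirpath, strerror(errno));
        return -1;
    }

    /* Resolve the file relative to the opened directory, so a later
     * rename or replacement of `dirpath` cannot redirect the write. */
    int fd = openat(dfd, filename,
                    O_WRONLY | O_CREAT | O_TRUNC | O_NOFOLLOW, 0644);
    if (fd == -1) {
        fprintf(stderr, "not writing: cannot open %s in %s: %s\n",
                filename, dirpath, strerror(errno));
        close(dfd);
        return -1;
    }

    size_t len = strlen(data);
    int rc = (write(fd, data, len) == (ssize_t)len) ? 0 : -1;
    if (close(fd) != 0)
        rc = -1;
    close(dfd);
    return rc;
}
```

A call such as write_datafile_nofollow("/home/username/datadir", "datafile.timestamp.txt", "...") would then either write the file or warn and leave everything untouched.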

      The readdir() part is just a minor issue and it might get removed in the future but it feels a bit clumsy right now. And since fdopendir() is part of POSIX.1-2008 one might hope to find it in a current Perl version.

      Anyway, thanks for all your replies. I guess I'll put up with the chdir() solution.