This is the long version of Re^2: open file error and Re^5: Unify windows filenames (from ten years ago), triggered by Re^2: perl script to compare two directories, plus a little bit of bean counting on the code for non-Unix systems.
File::Spec has a severe design problem that will cause wrong results on modern Unix systems (*BSD, Linux, MacOS X). It assumes that all filesystems are equal on any Unix operating system. And this assumtion is plain wrong since decades:
To make things worse, the code for at least Windows and cygwin is broken, too.
package File::Spec; use strict; our $VERSION = '3.75'; $VERSION =~ tr/_//d; my %module = ( MSWin32 => 'Win32', os2 => 'OS2', VMS => 'VMS', NetWare => 'Win32', # Yes, File::Spec::Win32 works on Ne +tWare. symbian => 'Win32', # Yes, File::Spec::Win32 works on sy +mbian. dos => 'OS2', # Yes, File::Spec::OS2 works on DJGP +P. cygwin => 'Cygwin', amigaos => 'AmigaOS'); my $module = $module{$^O} || 'Unix'; require "File/Spec/$module.pm"; our @ISA = ("File::Spec::$module"); 1;
For any modern Unix, the %module hash has no overriding key, and so File::Spec::Unix will implement all methods of File::Spec.
package File::Spec::Unix; use strict; use Cwd (); our $VERSION = '3.75'; $VERSION =~ tr/_//d; # ... sub case_tolerant { 0 } use constant _fn_case_tolerant => 0; # ...
File::Spec->case_tolerant() constantly returns false, and that is plain wrong, as explained above.
The constant _fn_case_tolerant, used by File::Spec::Functions, is also wrong. It should not exist at all.
Strangely, the Windows implementation, which optionally allows passing a drive letter to the case_tolerant() method, is better, but still wrong:
package File::Spec::Win32; use strict; use Cwd (); require File::Spec::Unix; our $VERSION = '3.75'; $VERSION =~ tr/_//d; our @ISA = qw(File::Spec::Unix); # ... sub case_tolerant { eval { local @INC = @INC; pop @INC if $INC[-1] eq '.'; require Win32API::File; } or return 1; my $drive = shift || "C:"; my $osFsType = "\0"x256; my $osVolName = "\0"x256; my $ouFsFlags = 0; Win32API::File::GetVolumeInformation($drive, $osVolName, 256, [], [] +, $ouFsFlags, $osFsType, 256 ); if ($ouFsFlags & Win32API::File::FS_CASE_SENSITIVE()) { return 0; } else { return 1; } } # ...
package File::Spec::Cygwin; use strict; require File::Spec::Unix; our $VERSION = '3.75'; $VERSION =~ tr/_//d; our @ISA = qw(File::Spec::Unix); # ... sub case_tolerant { return 1 unless $^O eq 'cygwin' and defined &Cygwin::mount_flags; my $drive = shift; if (! $drive) { my @flags = split(/,/, Cygwin::mount_flags('/cygwin')); my $prefix = pop(@flags); if (! $prefix || $prefix eq 'cygdrive') { $drive = '/cygdrive/c'; } elsif ($prefix eq '/') { $drive = '/c'; } else { $drive = "$prefix/c"; } } my $mntopts = Cygwin::mount_flags($drive); if ($mntopts and ($mntopts =~ /,managed/)) { return 0; } eval { local @INC = @INC; pop @INC if $INC[-1] eq '.'; require Win32API::File; } or return 1; my $osFsType = "\0"x256; my $osVolName = "\0"x256; my $ouFsFlags = 0; Win32API::File::GetVolumeInformation($drive, $osVolName, 256, [], [] +, $ouFsFlags, $osFsType, 256 ); if ($ouFsFlags & Win32API::File::FS_CASE_SENSITIVE()) { return 0; } else { return 1; } } # ...
File::Spec::AmigaOS inherits case_tolerant() from File::Spec::Unix, so it returns false. I have no clue how AmigaOS handles file names. AmigaOS states that device names are case insensitive, so case_tolerant() should return true for device names. IIRC, old Amigas could at least read DOS floppies. Again, case_tolerant() should return true for DOS floppies. So, it looks like File::Spec::AmigaOS::case_tolerant() is broken.
File::Spec::Epoc implements case_tolerant() to constantly return true. Again, I have no clue how EPOC a.k.a. Symbian OS handles file names.
File::Spec::Mac implements case_tolerant() to constantly return true. This should be ok, the filesystems of classic MacOS (MFS, HFS, HFS+) are case insensitive, as are FAT-formatted floppies from DOS. I don't know if old Macs could access any other filesystems.
File::Spec::VMS implements case_tolerant() to constantly return true. Again, no clue how VMS handles file names, or if it can mount case-sensitive filesystems. Files-11 suggest case insensitive behaviour. If case-sensitive filesystems can be mounted, this implementation is broken.
File::Spec::OS2, used also for DOS, implements case_tolerant() to constantly return true. This should be ok for DOS and OS/2 native filesystems (FAT, HPFS). I don't know if later versions of OS/2 support case-sensitive filesystems. If they do, this implementation is broken.
package File::Spec::Functions; # ... our $VERSION = '3.75'; $VERSION =~ tr/_//d; #... my %udeps = ( # ... case_tolerant => [], # ... ); foreach my $meth (@EXPORT, @EXPORT_OK) { my $sub = File::Spec->can($meth); no strict 'refs'; if (exists($udeps{$meth}) && $sub == File::Spec::Unix->can($meth) +&& !(grep { File::Spec->can($_) != File::Spec::Unix->can($_) } @{$udeps{$meth}}) && defined(&{"File::Spec::Unix::_fn_$meth"})) { *{$meth} = \&{"File::Spec::Unix::_fn_$meth"}; } else { *{$meth} = sub {&$sub('File::Spec', @_)}; } } # ...
There seems to be no easy fix for Unix. Returning false is the wrong answer for all of the edge case shown above. Returning true is the wrong answer for all of the common cases (native filesystems).
At least, case_tolerant() needs a path to work on, as there is no generic answer on Unix. The path should be the equivalent of a drive letter on windows, i.e. a mount point.
For convenience, passing a filename should be treated like passing the directory containing it. (This should happen for all operating systems)
Also for convenience, passing a directory that is not a mount point should be treated like passing the next mount point upwards the directory tree. (This should also happen for all operating systems.)
Detecting a mount point depends on the operating system, but a general solution for Unix exists: compare the dev fields of stat($dir) and lstat("$dir/.."), if they differ, you found a mount point. Note that this method fails to detect bind mounts on Linux (according to mountpoint.c from util-linux).
Knowing the mount point for a filesystem, you "only" have to find out if that filesystem is mounted case-sensitive or case-insensitive. And that depends on the operating system.
For bug-compatibility to the existing File::Spec::Unix, calling case_tolerant() without arguments could continue to return false.
case_tolerant() should also accept directories and files deep inside a volume, as required for the fixed File::Spec::Unix->case_tolerant()
As far as GetVolumeInformation() is concerned, mount points are either drive letters or server shares. That can easily be handled by a regexp.
Detecting NTFS volume mount points needs more work. See https://docs.microsoft.com/en-us/windows/win32/fileio/volume-mount-points.
GetVolumeInformationByHandleW() looks promising, but requires at least Vista / Server 2008, and it requires that you pass a file handle, not just a path.
case_tolerant() should also accept directories and files deep inside a volume, as required for the fixed File::Spec::Unix->case_tolerant()
Especially, do not check for Cygwin or Cycwin functions before you know you have to work with a cygwin path.
Alexander
|
---|