I don't think Perl should implement a VFS, simply because there are so many filesystems around, and as far as I understand your idea, that would require one module per filesystem. wc -l /proc/filesystems on a random Debian box shows 33 filesystems, including FUSE, which may be used to implement many more filesystems, perhaps even homegrown ones. Also, you need to know which filesystem is mounted in each and every directory, and you will probably also need to know the mount options (Linux can mount FAT and friends with different codepages, see mount).
I would like to see a different approach: A (maybe highly magical) use unicodepaths; that makes all filesystem functions (limited to a scope) accept and return Unicode strings.
As far as I understand Windows, this would essentially mean switching from the legacy ANSI API to the Wide (Unicode) API. Windows would perhaps be a good testbed for that switch, as it has an API that explicitly expects and returns Unicode.
For Linux and other Unix systems, some more thinking is needed. You basically need to know if the filenames are just bytes or if they are encoded in UTF-8.
Perhaps just guessing and trying to convert may work well enough for Unix:
Any filename returned from the operating system should be treated as bytes, unless unicodepaths is active. If unicodepaths is active, try to decode the bytes as UTF-8. If that succeeds, use the result as a Unicode string. If that fails, keep the bytes as-is, and don't set the UTF-8 flag on the returned filename.
Any filename passed to the operating system should be encoded to a UTF-8 byte stream if unicodepaths is active and the filename has the UTF-8 flag set. If unicodepaths is not active and/or the filename has the UTF-8 flag cleared, no encoding should happen. If unicodepaths is not active but the filename has the UTF-8 flag set, a warning should be issued.
(That warning does not seem to happen on my Debian box: perl -w -E '$fn="x\x{ABCD}"; open my $f,">",$fn; say $f "hi"; close $f;' does not warn at all. Perl is v5.32.1 for x86_64.)
Both combined should allow Perl to see Unicode where Unicode happens, while not messing with the encoding for non-Unicode filenames.
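A minimal sketch of the two rules above, just to make the idea concrete. The helper names fn_from_os and fn_to_os are made up for illustration, and this is not how a real unicodepaths pragma would have to hook into the core; it only shows the "try to decode, fall back to bytes" logic using the Encode module:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: filename bytes coming from the OS.
# Returns a character string if the bytes are valid UTF-8,
# otherwise returns the original bytes untouched.
sub fn_from_os {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its input
    my $chars = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK) };
    return defined $chars ? $chars : $bytes;
}

# Hypothetical helper: filename going to the OS.
# Only strings carrying the UTF-8 flag are encoded; everything
# else is assumed to be bytes already and passed through as-is.
sub fn_to_os {
    my ($name) = @_;
    return utf8::is_utf8($name) ? Encode::encode('UTF-8', $name) : $name;
}
```

A name that is valid UTF-8 comes back as a character string and is re-encoded to the same bytes on the way out; a Latin-1 name like "\xE9t\xE9" fails to decode, stays a byte string, and is handed back to the OS unchanged.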
Maybe this idea needs some more relaxed encoding of UTF-8 to allow a round-trip of any random bytes in a filename.
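One way such a relaxed encoding could look, loosely modelled on Python's "surrogateescape" error handler. Everything here is an assumption for illustration: the function names, and the choice of the private-use range U+F780..U+F7FF as escape characters for undecodable bytes. Valid UTF-8 that happens to encode a character from that range is escaped byte by byte, so the mapping stays reversible:

```perl
use strict;
use warnings;
use Encode ();

my $ESC_BASE = 0xF700;    # assumption: byte 0xNN is escaped as U+F7NN

# One well-formed UTF-8 sequence, matched against a byte string.
my $VALID = qr/
      [\x00-\x7F]
    | [\xC2-\xDF][\x80-\xBF]
    | \xE0[\xA0-\xBF][\x80-\xBF]
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
    | \xED[\x80-\x9F][\x80-\xBF]
    | \xF0[\x90-\xBF][\x80-\xBF]{2}
    | [\xF1-\xF3][\x80-\xBF]{3}
    | \xF4[\x80-\x8F][\x80-\xBF]{2}
/x;

# Bytes -> characters; never fails, undecodable bytes become U+F780..U+F7FF.
sub relaxed_decode {
    my ($bytes) = @_;
    my $out = '';
    while ($bytes =~ /\G(?:((?:$VALID)+)|(.))/gs) {
        my ($run, $bad) = ($1, $2);
        if (defined $run) {
            my $chunk = Encode::decode('UTF-8', $run);
            # Characters from the escape range are re-escaped bytewise,
            # so they cannot be confused with escaped raw bytes later.
            $chunk =~ s{([\x{F780}-\x{F7FF}])}{
                join '', map { chr($ESC_BASE + ord($_)) }
                         split //, Encode::encode('UTF-8', "$1")
            }gex;
            $out .= $chunk;
        }
        else {
            $out .= chr($ESC_BASE + ord($bad));
        }
    }
    return $out;
}

# Characters -> bytes; escape characters turn back into their raw byte.
sub relaxed_encode {
    my ($chars) = @_;
    my $out = '';
    for my $piece (split /([\x{F780}-\x{F7FF}])/, $chars) {
        next unless length $piece;
        $out .= $piece =~ /\A[\x{F780}-\x{F7FF}]\z/
            ? chr(ord($piece) - $ESC_BASE)
            : Encode::encode('UTF-8', $piece);
    }
    return $out;
}
```

With this, relaxed_encode(relaxed_decode($bytes)) should give back $bytes for any byte string, at the price of the escape characters showing up as odd private-use characters when such a name is displayed.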
Maybe this idea needs to split paths and handle each element of the path separately.
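A rough sketch of what per-element handling could look like on Unix, assuming "/" as the only separator. The function names are hypothetical. The point is that one directory with a legacy-encoded name no longer prevents the other elements from being decoded. Note that the elements are deliberately kept as a list: joining a byte element and a character element into one Perl string would silently reinterpret the bytes as Latin-1.

```perl
use strict;
use warnings;
use Encode ();

# Decode one path element if it is valid UTF-8, else keep its bytes.
sub decode_element {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its input
    my $chars = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK) };
    return defined $chars ? $chars : $bytes;
}

# Hypothetical: OS path (bytes) -> list of elements, each decoded on its own.
# The limit of -1 keeps trailing empty elements, e.g. for "dir/".
sub path_from_os {
    my ($bytes) = @_;
    return map { decode_element($_) } split m{/}, $bytes, -1;
}

# Hypothetical: list of elements -> OS path (bytes).
# Elements with the UTF-8 flag are encoded, byte elements pass through.
sub path_to_os {
    my (@elements) = @_;
    return join '/',
        map { utf8::is_utf8($_) ? Encode::encode('UTF-8', $_) : $_ } @elements;
}
```

For a path like "/home/\xE9t\xE9/caf\xC3\xA9.txt", the directory "\xE9t\xE9" stays a byte string while "café.txt" becomes a proper Unicode string, and path_to_os() reassembles the original bytes.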
Alexander
In reply to Re: What would you like to see in a Virtual Filesystem for Perl?
by afoken
in thread What would you like to see in a Virtual Filesystem for Perl?
by NERDVANA