I have too many projects as it is, but I keep coming back to the idea that Perl ought to have a universal "virtual filesystem" module to abstract away some details of the platforms it runs on. There are a lot of ways this could go, but I have two main itches to scratch:

  1. Seamless support for Unicode file names in a Path::Class-like API.
  2. The ability to work with filesystems that may be backed by real files or by emulated filesystems (e.g. browsing zip files, ftp, webdav, iso9660, and so on), and the ability to merge them together like mounts on Linux, but without needing elevated privileges on the host system.

As it happens, there is a great CPAN namespace "VFS" that a similar-minded person uploaded in 2004 but never finished implementing. I've reached out to him and it seems he might be open to the idea of handing it off to me to finish. Negotiations are ongoing.

But, before I touch such a great namespace, I'd like to collect ideas from more minds than just my own! Here are some important points that I am considering:

Unicode Filenames

On Unix, filenames are just bytes. Unicode support was added through locale settings, so that Unicode-aware programs can try decoding filenames according to the locale, but Perl does not consult the locale here and always returns bytes from readdir / glob / readlink / getcwd. Also, in Perl, if you take a filename that is bytes which happen to be valid UTF-8 and then append Unicode characters to that string, the resulting string will not be usable as a filename: it will flatten to bytes with a warning, but the high bytes you read from readdir get double-encoded, so the path no longer names a directory that exists.
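To make the double-encoding problem concrete, here is a minimal sketch. The directory name and suffix are made-up examples, and the byte string stands in for what readdir would return on a UTF-8 filesystem; nothing here touches the disk.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Bytes as readdir() would return them for a directory named "café"
# on a UTF-8 filesystem: the "é" arrives as the two bytes C3 A9.
my $dir_bytes = "caf\xC3\xA9";

# A decoded Unicode string (characters, not bytes): "/日本.txt"
my $suffix = "/\x{65E5}\x{672C}.txt";

# Naive concatenation: Perl upgrades $dir_bytes as if each byte were a
# Latin-1 character, so C3 A9 becomes the two characters U+00C3 U+00A9.
my $naive = $dir_bytes . $suffix;

# The bytes the OS would see once that string is flattened to UTF-8.
my $naive_on_disk = encode('UTF-8', $naive);

# Decode first, then concatenate, then encode at the I/O boundary.
my $correct         = decode('UTF-8', $dir_bytes) . $suffix;
my $correct_on_disk = encode('UTF-8', $correct);

# Render a byte string as space-separated hex for comparison.
sub hex_dump { join ' ', map { sprintf '%02X', ord } split //, shift }

print "naive:   ", hex_dump($naive_on_disk), "\n";
print "correct: ", hex_dump($correct_on_disk), "\n";
```

The naive result carries C3 83 C2 A9 where the name on disk has C3 A9, which is why the lookup fails.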

On Windows, Perl uses the ANSI ("A") API rather than the wide-character API, so the bytes you get from readdir depend on the Windows code page. This can work if the program is configured to run in the UTF-8 code page, but that is almost never the default, so most people get garbage when they read Unicode filenames under Windows, and have to do a lot of studying before they can make it work. Even if you do have the UTF-8 code page, it still leaves you with the same mess that you would have on Unix.

There are other filesystems where path names belong to known character sets, and are not left to guessing with locales. For instance, with iso9660 you know from the metadata which character set is being used, and the locale doesn't enter into it. A module that walks an iso9660 filesystem should always be understood to return Unicode names, and not get tangled up with the program's locale settings.

Proposal: While I might like a mode in Perl where readdir() returns Unicode, I suspect doing that on a global basis would break too much. I think a better solution is a Path::Class / Path::Tiny themed module where it is understood that all names given and returned are Unicode character strings. By using this module, authors can be assured that their code will work properly when presented with non-ASCII directory and file names, and that it will work cross-platform.
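As a rough illustration of what "all names given and returned are Unicode" means in practice, here is a sketch that encodes on the way in and decodes on the way out. The helper names list_dir_unicode and touch_unicode are hypothetical, not part of any existing module, and the sketch simply assumes the filesystem stores names as UTF-8; a real module would have to determine the encoding per platform and per filesystem.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Temp qw(tempdir);

# ASSUMPTION: names on disk are UTF-8. This is typical on Linux but is
# not guaranteed anywhere; it is hard-coded only to keep the sketch short.
my $FS_ENCODING = 'UTF-8';

# Hypothetical helper: list a directory, returning character strings.
sub list_dir_unicode {
    my ($dir) = @_;
    opendir(my $dh, encode($FS_ENCODING, $dir)) or die "opendir: $!";
    my @names = map  { decode($FS_ENCODING, $_) }
                grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return sort @names;
}

# Hypothetical helper: create an empty file from a character-string path.
sub touch_unicode {
    my ($path) = @_;
    open(my $fh, '>', encode($FS_ENCODING, $path)) or die "open: $!";
    close $fh;
}

# Decode the temp dir path too, so every string in the program is characters.
my $dir = decode($FS_ENCODING, tempdir(CLEANUP => 1));

touch_unicode("$dir/\x{65E5}\x{672C}.txt");
my @names = list_dir_unicode($dir);

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @names;
```

The point of the design is that encoding and decoding happen only at the boundary with the OS, so user code never has to know which strings came from readdir and which came from elsewhere.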

Virtual Filesystems

There are lots of great reasons for wanting virtual filesystems in the host, like FUSE modules, but why should we have them inside Perl?

Proposal:
To deal with all of these, I think the virtual filesystem should have independent filesystem objects, so they aren't all interconnected by default, and then an optional ability to use one of them to override the core Perl I/O operations. Each filesystem should have the ability to mount other filesystems at arbitrary paths, and each should have the ability to derive a "chroot" filesystem from an arbitrary path.
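To show the shape of that idea, here is a toy sketch of independent filesystem objects with mounting and a derived "chroot". Everything in it is hypothetical: the Sketch::MemFS and Sketch::ChrootFS classes and the mount, read_file and derive_chroot methods are invented for illustration and are not the API from the draft POD. Files are just hash entries, and ".." handling is deliberately left out.

```perl
use strict;
use warnings;

package Sketch::MemFS;    # hypothetical in-memory filesystem

sub new {
    my ($class, %files) = @_;
    return bless { files => {%files}, mounts => {} }, $class;
}

# Attach another filesystem object at a path prefix of this one.
sub mount {
    my ($self, $prefix, $fs) = @_;
    $prefix =~ s{/+\z}{};
    $self->{mounts}{$prefix} = $fs;
    return $self;
}

# Return file content, delegating to the longest matching mount point.
sub read_file {
    my ($self, $path) = @_;
    my @prefixes = sort { length($b) <=> length($a) } keys %{ $self->{mounts} };
    for my $prefix (@prefixes) {
        if ($path =~ m{\A\Q$prefix\E(/.*)\z}) {
            return $self->{mounts}{$prefix}->read_file($1);
        }
    }
    return $self->{files}{$path};    # undef if the file does not exist
}

# Derive a new filesystem whose root is a subdirectory of this one.
sub derive_chroot {
    my ($self, $root) = @_;
    $root =~ s{/+\z}{};
    return Sketch::ChrootFS->new($self, $root);
}

package Sketch::ChrootFS;    # hypothetical view onto a subtree

sub new {
    my ($class, $parent, $root) = @_;
    return bless { parent => $parent, root => $root }, $class;
}

# Prefix every path with the chroot directory. A real implementation
# would also have to stop ".." from escaping the root.
sub read_file {
    my ($self, $path) = @_;
    return $self->{parent}->read_file($self->{root} . $path);
}

package main;

my $archive = Sketch::MemFS->new('/README'   => "from archive\n");
my $root    = Sketch::MemFS->new('/etc/motd' => "hello\n");
$root->mount('/mnt/archive', $archive);

print $root->read_file('/mnt/archive/README');

my $jail = $root->derive_chroot('/mnt');
print $jail->read_file('/archive/README');
```

Both calls resolve to the same file in the mounted object. Because the objects are independent, neither $archive nor $jail can see /etc/motd unless something explicitly mounts it for them.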

Problems

Prior Work

I'm not the first one with this idea, of course. So far, I've found:

What Am I Forgetting?

So, if you made it through all of that, what I'm looking for are ideas! What am I forgetting? What other features would you like to see? What do you feel are deficiencies in the current popular path modules like Path::Class or Path::Tiny? Should I just be building on some other CPAN module?

Also, I wrote a rough draft of the POD for such a module at https://github.com/nrdvana/perl-VFS/blob/main/lib/VFS.pm


In reply to What would you like to see in a Virtual Filesystem for Perl? by NERDVANA