in reply to Munging file name, to be safe- & usable enough on Unix-like OSen & FAT32 file system

Ooh - also looking at PRN and CON, my module Text::CleanFragment doesn't do that! It uses Text::Unidecode to downgrade accented text (etc) to something ASCII, since I don't trust cross-filesystem consistency for non-ASCII :)
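
For what it's worth, a minimal sketch of what a PRN/CON check could look like, using core Perl only. The helper name `dodge_reserved_name` is made up for illustration and is not part of Text::CleanFragment; prefixing an underscore is just one assumed way to sidestep the reserved DOS/Windows device names.

```perl
use strict;
use warnings;

# DOS/Windows device names; they stay reserved even with an extension
# (so "con.txt" is as problematic as "CON").
my $RESERVED = qr/\A(?:CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(?:\..*)?\z/i;

# Hypothetical helper: prefix an underscore when the name is reserved,
# otherwise return the name unchanged.
sub dodge_reserved_name {
    my ($name) = @_;
    return $name =~ $RESERVED ? "_$name" : $name;
}

print dodge_reserved_name($_), "\n" for qw(PRN con.txt console.txt notes.txt);
```

Only whole-name matches are touched, so something like console.txt passes through as-is.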


Re^2: Munging file name, to be safe- & usable enough on Unix-like OSen & FAT32 file system
by parv (Parson) on Nov 26, 2023 at 09:36 UTC

    Thanks much for linking Text::CleanFragment & Text::Unidecode. The first one seems to do everything that I would do for <Windows file systems that are not NTFS>. It would be perfect to replace my code if the degradation of Unicode could be optionally skipped on file systems where Unicode is a non-issue.

    I wanted to use ICU or similar (or, the second module) to preserve some information; due to lack of time (motivation, really) I never looked into that. The second one may solve that issue for me. Sweet👍🏽

    Oh, there is Unicode::ICU💡 So I should create another version that first tries Unicode::ICU; failing that, Text::Unidecode. (Ugh, no, no, I do not want to go further down this rabbit hole, for I cannot help but think of Tom C's response on Stack Overflow (on WayBack Machine aka web.archive.org).)