in reply to Tk causes problems with file paths containing non-Latin-1 chars

I ran into this problem a while back with Tk, and this is the solution I was handed. You need to manually decode all the filenames so that the utf8 flag gets set on the strings you read from the filesystem. Don't ask me to explain all the details of how utf8 flags get set or ignored. :-)
# this decode utf8 routine is used so filenames with extended
# ascii characters (unicode) in filenames, will work properly
use Encode;
opendir my $dh, $path or warn "Error: $!";
my @files = grep !/^\.\.?$/, readdir $dh;
closedir $dh;
@files = map { decode( 'utf8', "$path/".$_ ) } sort @files;
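For anyone who wants this as a reusable routine, here is a minimal sketch. The helper name read_dir_decoded is made up for illustration, and the fallback to the raw bytes when a name is not valid UTF-8 is my own assumption, not part of the snippet above. It returns bare names rather than "$path/name".

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not from Tk or Encode): list a directory and
# decode each entry from UTF-8 bytes into a Perl character string.
sub read_dir_decoded {
    my ($path) = @_;
    opendir my $dh, $path or die "Cannot open $path: $!";
    my @names = grep { !/^\.\.?$/ } readdir $dh;
    closedir $dh;

    return map {
        my $bytes = $_;
        # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
        # $bytes intact so we can fall back to it (assumed behaviour we want).
        my $chars = eval {
            decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
        };
        defined $chars ? $chars : $bytes;
    } sort @names;
}
```

Whether leaving undecodable names as raw bytes is the right choice depends on what the GUI does with them; dying or skipping the entry are equally reasonable.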

Tk still works, but the active developer died a few years ago. If you want or need the kind of robust filesystem reading which you described, switch to Gtk2. It is very active, and bug fixes come in a matter of weeks.


I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh

Re^2: Tk causes problems with file paths containing non-Latin-1 chars
by ron7 (Beadle) on May 03, 2011 at 22:50 UTC
    Zentara: Your Encode::decode utf8 solution (which I also found during experiments) is a fix for Linux and OS X, but does not work for win32 (tested under XP only; no idea if Vista/W7 are different, but I can't see why they would be). For win32, the filename treatment required is
    @files = map { pack 'UW*', unpack 'C*', $_; } @files;
    The resultant string will now open and respond as expected to -d, -f, etc., but is no longer printable or regex-processable due to the "long char" problem. So I have to keep two copies of the name: one for processing, one for display.

    All that can be coded up via a simple package that decides:

    • Are we running inside Tk?
    • If so, is the os Windows?

    I actually started doing that, but the code got so horrible that I gave up and took the coward's way out: I added a known bug/limitation for Windows users (actually it was the inherent bug with Tk::FBox->getOpenFile barfing that made me give up).
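    For illustration only, a minimal sketch of what such a deciding package might look like. All three sub names are hypothetical, checking %INC for Tk.pm is just one assumed way to detect Tk, and the Windows branch simply reuses the pack/unpack line quoted above without any further testing.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helpers; none of these names exist in Tk or on CPAN.
sub running_under_tk { return exists $INC{'Tk.pm'} ? 1 : 0 }
sub on_windows       { return $^O eq 'MSWin32'     ? 1 : 0 }

# Takes one raw readdir entry and returns a two-element list:
# (name to use for open/-d/-f, name to use for display).
sub filename_pair {
    my ($raw) = @_;

    # Outside Tk, assume no special treatment is needed.
    return ($raw, $raw) unless running_under_tk();

    if (on_windows()) {
        # Treatment quoted in this post; keep the raw name for display
        # because the packed one is reported as not printable.
        my $for_ops = pack 'UW*', unpack 'C*', $raw;
        return ($for_ops, $raw);
    }

    # Linux / OS X: the decode approach from the parent post.
    my $decoded = decode('utf8', $raw);
    return ($decoded, $decoded);
}
```

    Usage would be along the lines of my ($open_name, $show_name) = filename_pair($entry); for each directory entry.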

    [Updated to show who and what I'm replying to as the post is out of place for some reason].

Re^2: Tk causes problems with file paths containing non-Latin-1 chars
by vkon (Curate) on May 03, 2011 at 18:04 UTC
    It appears that perl/Gtk dropped the AUTOLOAD mechanism in the Gtk binding, which means that a huge number of Perl subroutines are created at startup; most are never used and just pollute the symbol table.

    This is a shame; I cannot believe people program this way nowadays.
    Perl is fast and convenient, and doing a huge amount of needless work at startup is just not the way to go.

    The developers said they feel autoloading is not very stable, and what did they decide? To let us, the users, pay for their inability to program efficiently?

    This is just not acceptable.

Re^2: Tk causes problems with file paths containing non-Latin-1 chars
by vkon (Curate) on May 03, 2011 at 18:39 UTC
    .... I want to add that it is probably not very correct to put the emphasis on what happened to the author of perl/Tk.
    Moreover, perl/Tk is supported now and gets new releases.
    Maintaining perl/Tk is difficult given the implementation specifics of the module (indeed, I would recommend switching to any of the 3 other Tk modules)

    but your current sentence could easily be misread
    - the current active maintainer is well and hopefully healthy.