in reply to utf8 in directory and filenames

Here is a tip I got from graff; see Re: problems with extended ascii characters in filenames

Summary:

    # this decode utf8 routine is used so filenames with extended
    # ascii characters (unicode) in filenames, will work properly
    use Encode;

    opendir my $dh, $path or warn "Error: $!";
    my @files = grep !/^\.\.?$/, readdir $dh;
    closedir $dh;

    # @files = map{ "$path/".$_ } sort @files;
    #$_ = decode( 'utf8', $_ ) for ( @files );
    @files = map { decode( 'utf8', "$path/".$_ ) } sort @files;

I'm not really a human, but I play one on earth. Cogito ergo sum a bum

Re^2: utf8 in directory and filenames
by Juerd (Abbot) on Nov 13, 2006 at 17:26 UTC

    Note that the result from decode is a text string, and should never be used as a filename. It's good for displaying the filename to human beings, but not for actually opening the file or storing the filename. When that poses a problem, because the filename must be stored in a text document that's actually meant for computers, consider finding a way to encode the bytes to an ASCII-compatible format, like with URI-escaping or quoted printable.
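As a rough illustration of the point above, here is a minimal sketch that keeps three forms of each name apart: the raw bytes for file operations, a decoded copy for display, and an ASCII-only escaped copy for storage. The helpers escape_bytes and unescape_bytes are hypothetical names made up for this example (a hand-rolled percent-escaping, not a library API), $path = '.' is a placeholder, and the sketch assumes the filesystem hands back UTF-8 encoded names, which may not hold on every platform.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: percent-escape every byte outside a small
# ASCII-safe set, so the stored form is plain ASCII.
sub escape_bytes {
    my ($bytes) = @_;
    (my $escaped = $bytes) =~ s{([^A-Za-z0-9._~/-])}{sprintf '%%%02X', ord $1}ge;
    return $escaped;
}

# Hypothetical inverse: turn %XX sequences back into the raw bytes.
sub unescape_bytes {
    my ($escaped) = @_;
    (my $bytes = $escaped) =~ s{%([0-9A-Fa-f]{2})}{chr hex $1}ge;
    return $bytes;
}

my $path = '.';    # assumption: directory to scan
opendir my $dh, $path or die "Cannot open $path: $!";
my @names = grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;

binmode STDOUT, ':encoding(UTF-8)';
for my $name (sort @names) {
    my $file    = "$path/$name";             # raw bytes: use for -f, open, etc.
    my $display = decode('UTF-8', $file);    # text string: for showing to people only
    my $stored  = escape_bytes($file);       # ASCII-only: for storing in a text document
    printf "%s  [%s]  %s\n", $display, $stored, (-f $file ? 'file' : 'other');
}
```

To get a usable filename back from the stored form, pass it through unescape_bytes and hand the resulting bytes to open() unchanged; the decoded $display string is never fed back into a file operation.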

    Juerd # { site => 'juerd.nl', do_not_use => 'spamtrap', perl6_server => 'feather' }

Re^2: utf8 in directory and filenames
by tqviet (Acolyte) on Feb 04, 2023 at 05:21 UTC
    Thank you so much. I have used your code. It works perfectly.
Re^2: utf8 in directory and filenames
by soliplaya (Beadle) on Nov 13, 2006 at 17:03 UTC
    Thank you.
    I get the idea about "decoding" the byte string, so that Perl would know that this is to be treated as utf-8.
    What I still don't quite get is why "-f" and open() fail when used on the original entry returned by readdir(). Does anyone have an idea?

    (I'm also not quite sure if I'm pushing the right button to send this answer, but I guess I'll find out)