Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi there,

Using perl v5.10.1 (build 1006) built for MSWin32-x86-multi-thread on Windows XP, I am working on a simple Perl script to find PDF files in a Windows folder.
However, I have already spent 2 days trying to figure out why I sometimes get files with their correct names and sometimes file names in the DOS 8.3 format.

What I have discovered so far is that some files contain the Unicode Character 'MINUS SIGN' (U+2212) in their names. None of these names are read correctly by Perl subroutines; they come back as DOS 8.3 filenames instead.

For instance: COPYOF~1.PDF is displayed whereas the file name is actually "Copy of PCO-1810.pdf" (the minus sign is different from the standard ASCII/Windows dash; it might come from Mac users, but I am not sure).


I have tried almost everything I could understand about this problem:
_ used chdir/readdir,
_ used File::Find,
_ used <*.pdf> on a folder handle,
_ tried to convert output from Unicode to UTF-8 or Latin-1,
_ tried to use perl -CSDA to launch the script...

Do you guys have any idea why the DOS 8.3 file names are displayed, whereas a simple "dir" on the folder in the Windows DOS prompt displays all files without problem?
Is it a bug or am I missing something?

Thanks a lot in advance for your help.
Regards.
Azulito

Replies are listed 'Best First'.
Re: DOS 8.3 filenames output when filenames contain Unicode Character 'MINUS SIGN' (U+2212)
by Sandy (Curate) on Dec 08, 2009 at 15:02 UTC
    Hello

    I can't reproduce your problem, so I don't know if this will work, but it's worth a shot:

    Win32

    Win32::GetLongPathName(PATHNAME)

    CORE: Returns a representation of PATHNAME composed of longname components (if any). The result may not necessarily be longer than PATHNAME. No attempt is made to convert PATHNAME to the absolute path. Compare with Win32::GetShortPathName() and Win32::GetFullPathName().

    This function may return the pathname in Unicode if it cannot be represented in the system codepage. Use Win32::GetANSIPathName() before passing the path to a system call or another program.
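
    A minimal, untested sketch of how that suggestion might be applied to a readdir loop. The helper name long_name is hypothetical, and the UTF-8 output layer is an assumption about how you want to print the result; the console may still not render U+2212, depending on its code page and font.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a (possibly 8.3) path to its long form.
# Returns the input unchanged when not on Windows, when the Win32
# module cannot be loaded, or when the lookup returns nothing.
sub long_name {
    my ($path) = @_;
    return $path unless $^O eq 'MSWin32';
    return $path unless eval { require Win32; 1 };
    my $long = Win32::GetLongPathName($path);
    return ( defined $long && length $long ) ? $long : $path;
}

# Directory to scan: first command-line argument, else the current one.
my $dir = @ARGV ? $ARGV[0] : '.';

opendir( my $dh, $dir ) or die "Cannot open $dir: $!";
my @pdfs = sort grep { /\.pdf$/i } readdir $dh;
closedir $dh;

# The long name may come back as a Unicode string, so encode on output.
binmode( STDOUT, ':encoding(UTF-8)' );
print long_name("$dir/$_"), "\n" for @pdfs;
```

    As the documentation above says, if the long name then has to go to a system call or another program, pass it through Win32::GetANSIPathName() first.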

    Sandy