http://qs1969.pair.com?node_id=67760

mkmcconn has asked for the wisdom of the Perl Monks concerning the following question:

My work permits me to use Linux and FreeBSD most of the day, which I'm happy for just because it's fun. But, there are many things more easily done in Windows - like, working in Windows. So, this is not a Windows-bashing post, even though it may provide some amusement to Microsoft haters.

I am embarrassed by what I'm about to show you, and proud of it at the same time. It's a workaround (in the bad sense) for one of the silliest day-to-day problems I face: reassociating the "DOS" filename with the Long File Name in Windows. It's a silly problem, but I want to tell you how I approached it and invite your better solutions.

Some background:
A government (read: "would-try-to-solve-this-problem-if-not-for-under-budgeting") vendor provides files for us, zipped into self-extracting archives. The images are indexed similarly to this:

    #Year Doc_Num Image_File
    2001 20233 E:\TEMP\IMAGES\2001_020233.tif
    2001 20234 E:\TEMP\IMAGES\2001_020234.tif
    # etc...
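For readers who want to experiment, here is a minimal sketch of pulling the fields out of one index line. It assumes the three whitespace-separated columns shown in the sample and treats lines starting with `#` as comments; `parse_index_line` is a hypothetical helper name, not something from the vendor or from my program.

```perl
use strict;
use warnings;

# Parse one index line into (year, doc_num, file name).
# Returns an empty list for comment lines, blank lines, and short lines.
# The column layout is an assumption taken from the sample index.
sub parse_index_line {
    my ($line) = @_;
    return if $line =~ /^\s*(?:#|$)/;
    my ($year, $doc_num, $path) = split ' ', $line, 3;
    return unless defined $path;
    $path =~ s/\s+$//;    # drop trailing newline or CR/LF
    # Everything after the last backslash or slash is the file name.
    my ($name) = $path =~ m{([^\\/]+)$};
    return ($year, $doc_num, $name);
}

my @sample = (
    "#Year Doc_Num Image_File",
    "2001 20233 E:\\TEMP\\IMAGES\\2001_020233.tif",
    "2001 20234 E:\\TEMP\\IMAGES\\2001_020234.tif",
);
for my $line (@sample) {
    my ($year, $doc_num, $name) = parse_index_line($line) or next;
    print "$year $doc_num $name\n";
}
```

Splitting with a limit of 3 keeps the path in one piece even if it happens to contain spaces.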

All the names of the files in this index correspond to the files in the zipped archive. However, the vendor's software, at some point between production and delivery, does not support Long File Names. Thus, the files in the archive display only an 8.3 name - the LFN has been disassociated from the file. The vendor has also shared this program with other agencies. Consequently, we see this problem with increasing frequency.

The task is to reassociate the 8.3 filename with the LFN listed in the index, so that the index can be used and we can avoid renaming the files by hand. Here is part of the code that does that:

    use strict;
    use Win32;
    use File::Basename qw(fileparse basename);
    use CGI qw(pretty);
    $|++;

    my $out = new CGI;
    my ( $directory, $tempdir, $index_file);

    # ... snip ...

    sub make{
        mkdir $tempdir;
        open FILE, "< $index_file" or die $out->p("$index_file: $!");
        open OUT, "> $index_file.result" or warn $out->p("$!\n");
        print $out->start_p(),
              $out->br("\t",
                  $out->a({-href=> "file://$index_file.result"},
                          "$index_file.result"),
                  "OPENED\n");
        my $incr = 0 ;
        while (my $long = <FILE>){
            # Reduce the index line to the bare file name; skip other lines.
            $long =~ s/^.*(\b\w+_\w+\.\w+)\s*/$1/ or next;
            $long = "$tempdir/$long";
            # Create a zero-byte file so the system assigns it a short name.
            open NEWFILE, "> $long";
            close NEWFILE;
            my $short = Win32::GetShortPathName($long);
            # Strip the directory part from both names before logging them.
            $long =~ s/^.*(\b\w+)_(\w+\.\w+)/$1$2/;
            $short =~ s/^.*(\b\w+~\w+\.\w+)/$1/;
            print OUT $incr++,", $short, $long\n";
        }
        print $out->end_p();
        close OUT;
        close FILE;
    }

Yes, if you are still reading, what this sub does is create a directory full of zero-byte files named according to the index file. The program then asks the system for the Short Path of each of those files. Another sub destroys these temporary files and the produced log (OUT) when the work is completed.
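The rename step itself is not shown in this post. A minimal sketch of it might look like the following, assuming the log lines have the form "N, SHORT, LONG", and assuming the short names generated on this machine really do match the ones in the extracted archive. `load_mapping` and `rename_from_mapping` are hypothetical names for illustration, not the subs from my program.

```perl
use strict;
use warnings;
use File::Spec;

# Read "N, SHORT, LONG" lines into a hash keyed by upper-cased short name.
sub load_mapping {
    my ($fh) = @_;
    my %long_for;
    while (my $line = <$fh>) {
        $line =~ s/\s+$//;    # drop newline or CR/LF
        my (undef, $short, $long) = split /\s*,\s*/, $line, 3;
        next unless defined $long && length $long;
        $long_for{ uc $short } = $long;
    }
    return \%long_for;
}

# Rename every plain file in $dir whose name appears in the mapping.
# Never overwrites an existing file. Returns the number of files renamed.
sub rename_from_mapping {
    my ($dir, $long_for) = @_;
    opendir my $dh, $dir or die "$dir: $!";
    my @names = grep { -f File::Spec->catfile($dir, $_) } readdir $dh;
    closedir $dh;

    my $count = 0;
    for my $name (@names) {
        my $long = $long_for->{ uc $name } or next;
        my $from = File::Spec->catfile($dir, $name);
        my $to   = File::Spec->catfile($dir, $long);
        if (-e $to) {
            warn "$to already exists, skipping $name\n";
            next;
        }
        if (rename $from, $to) { $count++ }
        else                   { warn "rename $from: $!\n" }
    }
    return $count;
}

# Usage (paths are placeholders):
#   open my $log, '<', 'index.txt.result' or die $!;
#   my $renamed = rename_from_mapping('extracted', load_mapping($log));
```

Keying the hash on the upper-cased short name sidesteps any case differences between the log and the directory listing.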

Why such a roundabout route? Well, as all you 12th level mages know, and as I've only recently found out: Microsoft has no fewer than three completely different LFN => 8.3 conversion algorithms for their three most common operating systems.
Under Windows 98:

    2001_020674.tif => 2001_0~1.TIF
    ...
    2001_020677.tif => 2001_0~4.TIF
    2001_020678.tif => 2001_0~5.TIF

Under Windows NT:

    2001_020674.tif => 2001_0~1.TIF
    ...
    2001_020677.tif => 2001_0~4.TIF
    2001_020678.tif => 204EFD~1.TIF

Under Windows 2000:

    2001_020674.tif => 2001_0~1.TIF
    ...
    2001_020677.tif => 2001_0~4.TIF
    2001_020678.tif => 208483~1.TIF

So, you see, the 8.3 name is created using a different algorithm, depending on the system.
Windows 98 may have names like 20~36009.TIF, but NT and 2000 create names with only one character to the right of the tilde.

All the systems will attempt to name the first file 2001_0~1.TIF. Spotting a conflict with that name, they will all manufacture a non-conflicting name in the same pattern, up to the fourth conflicting name. Then the dissimilarity in the algorithms appears, and from then on the name conflicts are resolved with completely different results. I would not be at all surprised if there are as many results as there are different Windows, but I don't know this.
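The part that all three systems appear to share can be sketched in a few lines. This models only what the listings above show: the first six characters of the name, a tilde, a digit from 1 to 4, and the upper-cased extension. It assumes names like the samples (no spaces, one dot, too long for 8.3), and it deliberately gives up at the fifth conflict, since that is where the systems diverge and I don't know the rules. `common_short_name` is a hypothetical name.

```perl
use strict;
use warnings;

# Return the first free name among STEM~1 .. STEM~4, marking it as taken.
# Returns nothing once all four are used: from the fifth conflict on, the
# result is system-specific and is not modelled here.
sub common_short_name {
    my ($long, $taken) = @_;
    my ($base, $ext) = $long =~ /^(.+)\.([^.]*)$/ or return;
    my $stem = uc substr($base, 0, 6);
    my $tail = uc substr($ext, 0, 3);
    for my $n (1 .. 4) {
        my $candidate = "$stem~$n.$tail";
        next if $taken->{$candidate};
        $taken->{$candidate} = 1;
        return $candidate;
    }
    return;
}

my %taken;
for my $doc (20674 .. 20678) {
    my $long  = sprintf '2001_%06d.tif', $doc;
    my $short = common_short_name($long, \%taken);
    printf "%s => %s\n", $long,
        defined $short ? $short : '(system-specific)';
}
```

Run against the five sample names, this gives 2001_0~1.TIF through 2001_0~4.TIF for the first four and reports the fifth as system-specific, which matches the point where the three listings stop agreeing.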

So, a question/challenge to close, since I chose to post this in Seekers of Perl Wisdom:
If you have some simple way to resolve this problem, then I would be most happy to read it.

Especially, if you know what the algorithm is that handles this for each version of Windows, perhaps you can point it out to me and I can try to translate it into Perl (it may be in windows.h - I am a poor reader of C and C++, but I don't have that header file to try).

I hope you find this at least amusing, or even educational, as I have.
mkmcconn