astroboy has asked for the wisdom of the Perl Monks concerning the following question:

A while back I wrote a script that needed to pull out file names from an input line (including the full path) and convert them into the short name equivalent, e.g.:

C:\Program Files\Application Name\bin\app.exe > C:\Program Files\Application Name\log\app.log

would become

C:\PROGRA~1\APPLIC~1\bin\app.exe > C:\PROGRA~1\APPLIC~1\log\app.log

I should point out that the line format could vary. In the case above, a script redirects its output to a log file, but the line may consist of a filename, followed by three other filenames that may be arguments passed to the first file.

Now, getting the short name is easy, I'd just pass the name to Win32::GetShortPathName. However, I never came up with a way to pull out the file names in the first place. In the end, I gave up, and just made it mandatory that the input only had the short names.

I never posted to Perlmonks at the time, cos I didn't want to admit defeat, but this has been nagging at me for a while now, and it's time to come, hat in hand: how do I do this?

UPDATE: Wow, I seem to be confusing people. I hope this explains things: I have a file where each line can contain Windows filenames. I gave two examples, but in fact I have no way of knowing in advance what the line will look like. All I know is that

  • There will be filenames, that may be in the long format. I need them in short name format
  • Effectively, anything that can be run from the DOS command line may appear in the files - except that long names must be converted to short names first. At some point, each line will be passed to cmd.exe by a Windows service

    Update 2: Effectively, what I've done is write cron for Windows. I read the crontab and schedule each line. Unfortunately, the crontab lines can't contain long names. I need short names, so I was hoping for a regex of some sort that could do this for me.

    Ultimately, the line will be passed to cmd.exe - so the line can contain anything that cmd can (except long file names - but this is not related to cmd). I'm really bad at regexes. Nothing I've tried comes even close to what I want - which is why I haven't posted anything. As I said, I didn't post to Perlmonks before (I wrote this cron over a month ago), cos I knew I would be asked for my regex attempts, and I was ashamed of my regex feebleness

  • Replies are listed 'Best First'.
    Re: Parsing a list or Win32 filenames
    by ikegami (Patriarch) on Sep 09, 2004 at 20:46 UTC

      The updates don't help any. The question I asked in my original post remains. How can you tell where a file name ends? What you ask is not possible, unless you impose extra restrictions. Let's say the file contains the line:

      C:\Program Files\Application Name\bin\app.exe > C:\Program Files\Application Name\log\app.log do_it 2> C:\Program Files\Foo Bar\log\app.log

      Should the above return:

      C:\Progra~1\Applic~1\bin\app.exe > C:\Progra~1\Applic~1\log\applog~1 2> C:\Progra~1\FooBar~1\log\app.log

      Probably not. You probably wanted:

      C:\Progra~1\Applic~1\bin\app.exe > C:\Progra~1\Applic~1\log\app.log do_it 2> C:\Progra~1\FooBar~1\log\app.log

      OK, so you might say you'll never place arguments between redirects (although it would be an exception to what you said you allow). What if the file contained the line:

      C:\Program Files\Application Name\bin\app.exe ola c:\file1 ole c:\file2

      Should the above return:
      C:\Progra~1\Applic~1\bin\appexe~1 c:\file1o~1 c:\file2
      or
      C:\Progra~1\Applic~1\bin\app.exe ola c:\file1o~1 c:\file2
      or
      C:\Progra~1\Applic~1\bin\app.exe ola c:\file1 ole c:\file2
      Probably not the first, but there's no way to know between the bottom two. It depends on what app.exe expects.

      That's why I recommended you quote your paths:

      sub quote {
          local $_ = $_[0];
          s/(["\\])/\\$1/g;
          qq{"$_"}
      }

      sub unquote {
          local $_ = $_[0];
          s/^"(.*)"$/$1/ or return $_;
          s/\\(.)/$1/g;
          $_
      }

      while (<IN>) {
          @items = /("(?:[^"\\]|\\.)+"|[^"\s]+)/g;
          foreach (@items) {
              $_ = unquote($_);
              print(Win32::GetShortPathName($_) || quote($_), ' ');
          }
          print("\n");
      }

      But that's not all! What if the file doesn't exist? It's impossible to get the short file name of a file that doesn't yet exist!
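A rough sketch of how the quoting idea and the missing-file fallback could fit together. This is not from the thread: `shorten_line` and the stub lookup table are made up for illustration. It assumes that paths with spaces are double-quoted in the crontab and, unlike the code above, it does not treat backslash as an escape character, since a Windows file name cannot contain a double quote anyway.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite one line. $to_short is a code ref that maps
# a long path to its short form, or returns undef/'' when it cannot
# (for example because the file does not exist yet).
sub shorten_line {
    my ($line, $to_short) = @_;
    my @out;
    # An item is either a double-quoted run or a run of non-blank,
    # non-quote characters (arguments, >, 2>, ...).
    while ($line =~ /"([^"]*)"|([^"\s]+)/g) {
        my ($quoted, $bare) = ($1, $2);
        if (defined $quoted) {
            my $short = $to_short->($quoted);
            # No short name available: keep the long name, still quoted.
            push @out, $short ? $short : qq{"$quoted"};
        }
        else {
            push @out, $bare;    # passed through untouched
        }
    }
    return join ' ', @out;
}

# Stub converter so the sketch runs anywhere: it only "knows" app.exe.
my %known = (
    'C:\Program Files\Application Name\bin\app.exe'
        => 'C:\PROGRA~1\APPLIC~1\bin\app.exe',
);
my $line = '"C:\Program Files\Application Name\bin\app.exe" do_it '
         . '> "C:\Program Files\Application Name\log\app.log"';

print shorten_line($line, sub { $known{ $_[0] } }), "\n";
# C:\PROGRA~1\APPLIC~1\bin\app.exe do_it > "C:\Program Files\Application Name\log\app.log"
```

On Windows the stub would presumably be swapped for `\&Win32::GetShortPathName` (after `use Win32;`). The log file that does not exist yet stays in its long, quoted form, which cmd.exe can still parse as a single name.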

        Yes, I get your point now. I suppose that cmd.exe must be making some kind of assumptions in how it wants to interpret the lines. I'm guessing that I have to either quote things or go back to short names (or both, I suspect). Thanks for your help.
    Re: Parsing a list or Win32 filenames
    by bobf (Monsignor) on Sep 09, 2004 at 18:29 UTC

      If you're looking to parse a path into directories and the filename, use File::Basename or File::Spec. If you're just creating a path for the output (log) file, though, I'm not sure why you need the short format in the first place. You should be able to use long (complete) directory and file names on windows - it works for me.

      HTH

      Update: I just realized your question might be referring to parsing the command line, not just the filenames. If that's the case you'll have to look at @ARGV (see perlvar), which contains command line arguments. You might also want to consider using something like Getopt rather than pulling things out of @ARGV yourself.

        Sorry, I wasn't clear. I'm not trying to pull apart a file name into directories and basenames - just loop through a file, take each line with Win32 filenames and convert the long names into short names. The new line will be passed to another application that can't deal with spaces in names
    Re: Parsing a list or Win32 filenames
    by ikegami (Patriarch) on Sep 09, 2004 at 18:40 UTC

      You say the first file name can be "followed by three other filenames". How can you tell where the file names end? Is a b c d 4 file names, or 3 (say 'a', 'b' and 'c d')? Windows and/or many applications support using double quotes to surround arguments with spaces in them.

      >perl -MWin32 -e "print(Win32::GetShortPathName($_), ' ') foreach (@ARGV);" "My Documents" "Start Menu"
      MYDOCU~1 STARTM~1

      If you want to implement double quoting yourself...

      sub deslash { local $_=$_[0]; s/\\(.)/$1/g; $_ }

      s/"((?:[^"\\]|\\.)+)"/ deslash($1) /ge;
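One thing worth checking before adopting the backslash-escape convention above: because `\` is the escape character, it is also eaten out of ordinary Windows paths unless it is doubled in the input. The small sketch below shows this. `strip_quotes` is a hypothetical wrapper name, and `deslash` is rewritten to avoid `$_`, but the two regexes are the same as in the post.

```perl
use strict;
use warnings;

# Same idea as deslash above, written without touching $_.
sub deslash {
    my ($s) = @_;
    $s =~ s/\\(.)/$1/g;    # drop the backslash, keep the escaped character
    return $s;
}

# Hypothetical wrapper: replace each "quoted item" by its unescaped content.
sub strip_quotes {
    my ($line) = @_;
    $line =~ s/"((?:[^"\\]|\\.)+)"/deslash($1)/ge;
    return $line;
}

my $doubled = 'dir "C:\\\\My Documents"';   # the text: dir "C:\\My Documents"
my $single  = 'dir "C:\My Documents"';      # the text: dir "C:\My Documents"

print strip_quotes($doubled), "\n";   # dir C:\My Documents
print strip_quotes($single),  "\n";   # dir C:My Documents
```

So under this convention the input file would need doubled backslashes inside quotes. If that is not acceptable, dropping the escape handling and matching `"[^"]*"` instead is one possible alternative.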
    Re: Parsing a list or Win32 filenames
    by bobf (Monsignor) on Sep 09, 2004 at 19:06 UTC

      Ahh - I think I understand the question now (in the future, a few lines of the input file to serve as an example would be helpful). If I am getting this right, you need to read in the input file line-by-line, examine the line for path/filenames, and convert those filenames to short format. What have you tried? It seems fairly straightforward - perhaps split on whitespace and/or a regex to identify the path/filenames on each line. (It is hard to construct a good regex without knowing the format of the lines, so posting that will help.)
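For what it's worth, a quick check of the split-on-whitespace idea against the first example line from the question shows why it isn't enough on its own when names contain spaces. This is a throwaway sketch, not code from the thread.

```perl
use strict;
use warnings;

# The unquoted example line from the original question.
my $line = 'C:\Program Files\Application Name\bin\app.exe '
         . '> C:\Program Files\Application Name\log\app.log';

# split ' ' breaks on runs of whitespace, so each path falls apart
# wherever a directory name contains a space.
my @pieces = split ' ', $line;

print scalar(@pieces), " pieces\n";   # 7 pieces
print "  [$_]\n" for @pieces;
```

The two paths come out as six fragments plus the `>`, and none of the fragments is a name that could be looked up on its own.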

        (see update 2)