in reply to Re^6: quoting issue with system command (Win32)
in thread quoting issue with system command

Your problems stem from the assumption that there's a standard way of escaping quotes on Windows. Is there one? There didn't use to be.

Re^8: quoting issue with system command (Win32 standard)
by tye (Sage) on May 14, 2009 at 05:50 UTC

    There is actually a documented, if little-known, standard for escaping double quotes inside double quotes on Win32 command lines. Programs that are written in C, or that use Microsoft's routine for parsing command lines into arguments, will follow that standard. Unfortunately, not all programs (including some provided by Microsoft) use Microsoft's code for parsing command lines into arguments, or even follow Microsoft's documented rules.

    Note that cmd.exe is still in the picture in most cases so not following Microsoft's rules can get you into minor trouble. You can produce a command that is incapable of receiving certain argument values. That is, if you want one of the things that cmd.exe takes care of for you (<, >, |, &, ^, etc.) included in a command-line argument, then you need to quote it in a way that cmd.exe recognizes. That way of quoting it would also need to be compatible with however the particular program chooses to parse its command line.
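To make the cmd.exe layer concrete, here is a minimal sketch of one commonly cited approach: put a caret in front of every character cmd.exe treats specially, including the double quote, so that cmd.exe never enters its own quoting mode and hands the rest of the line through unchanged. The function name, the exact character set, and the choice of Python are assumptions for illustration, not something from this thread. How % and ! behave also depends on context (batch file versus command line, delayed expansion), so treat this as a starting point only.

```python
# Hypothetical helper: protect a command line from cmd.exe's own parsing.
# The metacharacter set below is an assumption; it may not be complete.
CMD_METACHARS = '()%!^"<>&|'


def caret_escape(text):
    """Prefix each cmd.exe metacharacter in text with a caret (^)."""
    return ''.join('^' + ch if ch in CMD_METACHARS else ch for ch in text)


if __name__ == '__main__':
    print(caret_escape('a&b|c'))   # a^&b^|c
    print(caret_escape('"x y"'))   # ^"x y^"
```

This step would only be applied after the argument has already been quoted for the target program's own parser, since cmd.exe strips the carets before the program sees the line.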

    Anyway, the way you escape quotes inside of double quotes on Win32 command lines is to precede them with a backslash. Unlike shell quoting, you can't just throw a double quote into the middle of an argument to start quoting there.

    The specification is incomplete (lots of "behaviors that are undefined") and the implementation is at least a bit perverse which makes it appear quite buggy when one so quickly runs into the unspecified cases. The implementation may even be actually buggy for some cases that are within the specified limits, but I'm not aware of any such clear bugs. Quoting MS documentation:

    Microsoft C/C++ startup code uses the following rules when interpreting arguments given on the operating system command line:
    • Arguments are delimited by white space, which is either a space or a tab.
    • The caret character (^) is not recognized as an escape character or delimiter. The character is handled completely by the command-line parser in the operating system before being passed to the argv array in the program.
    • A string surrounded by double quotation marks ("string") is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument.
    • A double quotation mark preceded by a backslash (\") is interpreted as a literal double quotation mark character (").
    • Backslashes are interpreted literally, unless they immediately precede a double quotation mark.
    • If an even number of backslashes is followed by a double quotation mark, one backslash is placed in the argv array for every pair of backslashes, and the double quotation mark is interpreted as a string delimiter.
    • If an odd number of backslashes is followed by a double quotation mark, one backslash is placed in the argv array for every pair of backslashes, and the double quotation mark is "escaped" by the remaining backslash, causing a literal double quotation mark (") to be placed in argv.
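The quoted rules can be sketched as a small parser. This is only an illustration of the documented behavior, not Microsoft's code. The name split_args is hypothetical. It covers arguments only (not the separate handling of the program name) and makes no attempt to reproduce the undocumented cases, such as doubled quotes inside a quoted string.

```python
def split_args(cmdline):
    """Split a command-line tail into arguments using the documented rules."""
    args = []
    current = []          # pieces of the argument being built
    in_quotes = False     # inside a "..." section?
    have_arg = False      # has the current argument started (even if empty)?
    i, n = 0, len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == '\\':
            # Count the whole run of backslashes.
            j = i
            while j < n and cmdline[j] == '\\':
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                # Run followed by a quote: one backslash per pair.
                current.append('\\' * (count // 2))
                if count % 2:
                    current.append('"')        # odd: literal quote
                else:
                    in_quotes = not in_quotes  # even: string delimiter
                i = j + 1
            else:
                # Not followed by a quote: backslashes are literal.
                current.append('\\' * count)
                i = j
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in ' \t' and not in_quotes:
            # Unquoted white space ends the current argument.
            if have_arg:
                args.append(''.join(current))
                current = []
                have_arg = False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append(''.join(current))
    return args


if __name__ == '__main__':
    print(split_args(r'"\"some value\"" plain'))  # ['"some value"', 'plain']
```

The backslash-run counting is the part that makes the behavior look odd at first: backslashes only change meaning when the run ends in a double quote, which is why a quoted directory path ending in a backslash has to be written with that backslash doubled.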

    There's also a library routine that does this parsing that can be used by programs not written in C, though I didn't spend the time trying to remember or find the name of it.

    - tye        

        Also, parse_cmdline in file stdargv.c is the code that actually runs to prepare argc/argv for main.
Re^8: quoting issue with system command (Win32)
by almut (Canon) on May 13, 2009 at 16:05 UTC
    No, it works when you do it right (don't specify the quotes).

    So, how would you do it "right" if you need to pass a value that is double-quoted, using the multi-argument form?  I.e., if you want the called program to get a value like '"some value"' (including the double quotes) as one argument.

    (Hmm, seems the content of the node changed while I was replying...)

      Sorry, I wasn't sure what I was trying to say.

      if you want the called program to get a value like '"some value"' (including the double quotes) as one argument.

      I don't know if there's a standard quoting rule for that. If there is, then you're right, it's buggy. (Perhaps intentionally for backwards compatibility reasons?)

      Easy: forget the non-portable and provincial system function and call the platform-specific CreateProcess in the Win32 API, passing it the string you really intend.

        I respectfully disagree. Isn't the basic problem on Windows that the Windows API (i.e. CreateProcess) simply does not allow several individual command-line arguments to be passed to a program (as is possible on Unix)?  All that CreateProcess accepts is a single string (lpCommandLine), so I don't really see any advantage in using CreateProcess over system() with respect to quoting issues.

        If there were any way to properly emulate Unix semantics (i.e. separate arguments) on Windows to get around quoting issues etc., I'm pretty sure the Makers of Perl would have utilized that possibility...

Re^8: quoting issue with system command (Win32)
by John M. Dlugosz (Monsignor) on May 15, 2009 at 15:56 UTC
    The de-facto standard is the way that Microsoft's compiler's RTL start-up code does it. I've just copied that function directly when doing my own parsing.

    In stdargv.c,

    A quoted program name is handled here. The handling is much simpler than for other arguments. Basically, whatever lies between the leading double-quote and next one, or a terminal null character is simply accepted. Fancier handling is not required because the program name must be a legal NTFS/HPFS file name.
    Note that the double-quote characters are not copied, nor do they contribute to numchars.
    and then
    Rules: 2N backslashes + " ==> N backslashes and begin/end quote
    2N+1 backslashes + " ==> N backslashes + literal "
    N backslashes ==> N backslashes
    This means that if the EXE file name does have a 0x22 (QUOTATION MARK) in it, it won't parse out the argv/argc properly. Contrary to the comment, you can have a file by this name, and there are other file systems besides NTFS. Not using [<>:"/\|?*] in a file name on any filesystem is a Win32 convention, not a file system limitation.
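Read in reverse, the three backslash rules quoted above say how to build an argument that such a parser will decode back to the original string: a run of N backslashes before a literal quote becomes 2N+1 backslashes plus the quote, a run of N backslashes at the very end (just before the closing quote) becomes 2N, and all other backslashes stay as they are. The sketch below assumes the argument is always wrapped in double quotes. The name quote_arg is hypothetical, and it does nothing about cmd.exe metacharacters.

```python
def quote_arg(arg):
    """Quote one argument so the 2N / 2N+1 backslash rules decode it back."""
    out = ['"']
    backslashes = 0   # length of the backslash run seen so far
    for ch in arg:
        if ch == '\\':
            backslashes += 1
        elif ch == '"':
            # 2N+1 backslashes + " decodes to N backslashes + a literal "
            out.append('\\' * (2 * backslashes + 1))
            out.append('"')
            backslashes = 0
        else:
            # Backslashes not followed by a quote are taken literally.
            out.append('\\' * backslashes)
            out.append(ch)
            backslashes = 0
    # A trailing run sits right before the closing quote, so double it:
    # 2N backslashes + " decodes to N backslashes + end of quoted string.
    out.append('\\' * (2 * backslashes))
    out.append('"')
    return ''.join(out)


if __name__ == '__main__':
    print(quote_arg('"some value"'))  # "\"some value\""
    print(quote_arg('C:\\dir\\'))     # "C:\dir\\"
```

For the value asked about earlier in the thread, "some value" with the double quotes included, this gives the single command-line token "\"some value\"", which is the form a program using the Microsoft startup code should receive as one argument with the quotes intact.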