in reply to Problems with String::ShellQuote

Let's try to be a little bit positive. How could String::ShellQuote be fixed?

(Or: how can I do something useful while waiting for the four-hour offline RAID check on one of our servers?)


tybalt89++ has posted a good starting point in Re: Escape special chars in a path:

bash quoting is simple: put single quotes around the string after replacing every ' in the string with '"'"'

sub bashquote { "'" . shift =~ s/'/'"'"'/gr . "'" }

Sure, sometimes the wrapping single quotes are not needed, but if they are just around directory names and file names, they don't hurt.

This has to be done separately for every single argument to be passed to bash (or, in fact, to any Bourne shell).
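A minimal sketch of that per-argument rule, using tybalt89's approach spelled out a little. The file names and the `ls -l` command are made up for illustration, and the `/r` flag assumes Perl 5.14 or newer.

```perl
use strict;
use warnings;

# tybalt89's one-liner, spelled out (the /r flag needs Perl 5.14 or newer)
sub bashquote {
    my $s = shift;
    return "'" . ($s =~ s/'/'"'"'/gr) . "'";
}

# hypothetical file names, each one quoted as a separate argument
my @files = ("plain.txt", "two words.txt", "it's.txt");
my $cmd = join ' ', 'ls', '-l', map { bashquote($_) } @files;
print "$cmd\n";
```

Each file name ends up as exactly one shell word, including the one that contains a single quote.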

s///gr does not work in older perls (the /r flag was added in Perl 5.14), but that's easy to fix. Something like this:

# untested
sub bashquote {
    local $_ = shift;
    s/'/'"'"'/g;
    return "'$_'";
}

String::ShellQuote makes an important point: there is no way to pass NUL characters in an argument, simply because NUL is used as the string terminator. So you want to report that problem. There is also no way to pass Perl's undef, so report that as well:

# untested
sub bashquote {
    local $_ = shift;
    defined or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    s/'/'"'"'/g;
    return "'$_'";
}
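To see how a caller would notice those two error cases, here is a small sketch that wraps the call in eval. The three sample arguments and the @results array are only for illustration.

```perl
use strict;
use warnings;

sub bashquote {
    local $_ = shift;
    defined $_ or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    s/'/'"'"'/g;
    return "'$_'";
}

# One good and two bad arguments; collect either the quoted string or the error.
my @results;
for my $arg ("ok", undef, "a\x00b") {
    my $quoted = eval { bashquote($arg) };
    push @results, defined $quoted ? $quoted : "error: $@";
}
print "$_\n" for @results;
```

The good argument comes back quoted, the other two come back as error messages instead of silently damaged strings.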

String::ShellQuote uses a slightly shorter way to pass a single quote, \' instead of "'". It should not matter much, but it keeps the returned string shorter. That means we might avoid a limit for the length of a command:

# untested
sub bashquote {
    local $_ = shift;
    defined or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    # doubled backslash: the replacement is interpolated, a lone \' would become '
    s/'/'\\''/g;
    return "'$_'";
}

String::ShellQuote removes redundant leading and trailing empty single-quoted strings from the returned value. Again, it should not matter much, but it shortens the returned string. But we must handle the empty string as a special case, or else it would be returned as an unquoted empty string:

# untested
sub bashquote {
    local $_ = shift;
    defined or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    length or return "''";
    # doubled backslash: the replacement is interpolated, a lone \' would become '
    s/'/'\\''/g;
    $_ = "'$_'";
    s/^''//;
    s/''$//;
    return $_;
}
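A few sample inputs make the effect of the stripping step visible. This is a sketch of the function at this stage, written with a doubled backslash in the replacement because a single \' inside s/// interpolates to a bare quote. The sample strings are arbitrary.

```perl
use strict;
use warnings;

sub bashquote {
    local $_ = shift;
    defined $_ or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    length $_ or return "''";   # the empty string needs explicit quotes
    s/'/'\\''/g;                # ' becomes '\'' (close, escaped quote, reopen)
    $_ = "'$_'";
    s/^''//;                    # drop a redundant empty string at the start
    s/''$//;                    # ... and at the end
    return $_;
}

for my $in ("", "abc", "it's", "'abc", "abc'", "'") {
    printf "%-8s => %s\n", "[$in]", bashquote($in);
}
```

A lone single quote comes out as just \', and a string that starts or ends with a single quote loses the empty '' that the plain wrapping would have left there.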

String::ShellQuote is quite clever about shortening multiple successive single quotes, by switching from \' to double quotes starting at two successive single quotes:

# untested
sub bashquote {
    local $_ = shift;
    defined or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    length or return "''";
    # doubled backslash: the replacement is interpolated, a lone \' would become '
    s/'/'\\''/g;
    s|((?:'\\''){2,})|q{'"} . (q{'} x (length($1) / 4)) . q{"'}|ge;
    $_ = "'$_'";
    s/^''//;
    s/''$//;
    return $_;
}
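The double-quote switch is easiest to follow on inputs with runs of single quotes. The following sketch repeats the function at this stage (again with the doubled backslash) and prints a few arbitrary samples.

```perl
use strict;
use warnings;

sub bashquote {
    local $_ = shift;
    defined $_ or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    length $_ or return "''";
    s/'/'\\''/g;    # every ' becomes the four characters '\''
    # two or more of those in a row: close the quote, emit "''...", reopen
    s|((?:'\\''){2,})|q{'"} . (q{'} x (length($1) / 4)) . q{"'}|ge;
    $_ = "'$_'";
    s/^''//;
    s/''$//;
    return $_;
}

for my $in ("it's", "a''b", "'''", "''x") {
    printf "%-6s => %s\n", $in, bashquote($in);
}
```

A single quote still uses \', while a run such as the two quotes in a''b is wrapped in one pair of double quotes, which is shorter than repeating \' for each of them.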

There is one possible extra step: avoid quoting if the argument contains only "safe" characters. ASCII letters and ASCII digits should be safe. Underscore should be safe. (Update: Forward slash should be safe.) A lonely "." as the first part of a command in bash is short for "source", so that's not safe. "!" is not safe, as shown. And I won't bother researching all of the other punctuation characters for edge cases. So let's add that shortcut:

# untested
sub bashquote {
    local $_ = shift;
    defined or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    length or return "''";
    # before adding "/" to safe characters: /^[A-Za-z0-9_]+$/ and return $_;
    # \A and \z instead of ^ and $, so that a trailing newline does not count as safe
    m|\A[A-Za-z0-9/_]+\z| and return $_;
    # doubled backslash: the replacement is interpolated, a lone \' would become '
    s/'/'\\''/g;
    s|((?:'\\''){2,})|q{'"} . (q{'} x (length($1) / 4)) . q{"'}|ge;
    $_ = "'$_'";
    s/^''//;
    s/''$//;
    return $_;
}

Update: Added / as safe character.

This should give us a quoting function for a single string that returns a string that is safe to use anywhere in a bash command, and often as short as possible. If in doubt, the returned string may contain redundant quotes. This should work with any 8-bit-clean Bourne shell.
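One way to gain some confidence in that claim is a round trip through a real shell. The sketch below assumes that an sh with a printf command is on the PATH. The sample strings are arbitrary, and the function uses \A and \z in the safe-character test so that a trailing newline is never treated as safe.

```perl
use strict;
use warnings;

sub bashquote {
    local $_ = shift;
    defined $_ or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    length $_ or return "''";
    m|\A[A-Za-z0-9/_]+\z| and return $_;   # only safe characters: no quoting
    s/'/'\\''/g;
    s|((?:'\\''){2,})|q{'"} . (q{'} x (length($1) / 4)) . q{"'}|ge;
    $_ = "'$_'";
    s/^''//;
    s/''$//;
    return $_;
}

# Let sh parse each quoted string and print the resulting argument back.
for my $in ("plain", "two words", "it's", "''", '$HOME', "a\nb", "!", ".", "") {
    my $quoted = bashquote($in);
    open(my $fh, '-|', 'sh', '-c', "printf '%s' $quoted")
        or do { warn "cannot run sh: $!\n"; next };
    my $out = do { local $/; <$fh> };
    close $fh;
    $out = '' unless defined $out;
    print $out eq $in ? "ok      " : "MISMATCH", " $quoted\n";
}
```

Every line should start with "ok". A MISMATCH would mean that the shell saw something other than the original string.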

That's essentially the inner part of the foreach loop in String::ShellQuote. Now let's build the list function that returns a properly quoted string that a Bourne shell treats as a list of strings. How hard can that be?

# untested
sub shell_quote {
    return join(' ', map { bashquote($_) } @_);
}

Update: added a missing ) and the # untested comment

No position-dependent special cases for equal signs. No special cases for lonely periods or exclamation marks. They are always quoted, period.
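As an illustration of that rule, here is a sketch that runs the list function over a few arguments. The grep command line and the other sample arguments are invented, and the bashquote used here is the final version with the safe-character shortcut.

```perl
use strict;
use warnings;

sub bashquote {
    local $_ = shift;
    defined $_ or die "Argument must not be undef";
    /\x00/ and die "String must not contain NUL characters";
    length $_ or return "''";
    m|\A[A-Za-z0-9/_]+\z| and return $_;
    s/'/'\\''/g;
    s|((?:'\\''){2,})|q{'"} . (q{'} x (length($1) / 4)) . q{"'}|ge;
    $_ = "'$_'";
    s/^''//;
    s/''$//;
    return $_;
}

sub shell_quote {
    return join(' ', map { bashquote($_) } @_);
}

# A made-up command line: only the purely "safe" words stay unquoted.
print shell_quote('grep', '-r', "don't panic", '/var/log'), "\n";

# ".", "!" and "=" get quotes no matter where they appear in the list.
print shell_quote('.', '!', 'FOO=bar'), "\n";
```

The equal sign, the lonely period and the exclamation mark are quoted in every position, so there is nothing position-dependent left to get wrong.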

What about shell_quote_best_effort()?

There is simply no "best effort" way. Either the result is properly quoted, or data will be damaged. shell_quote_best_effort() should be deprecated and finally be removed. Until then, it could be an alias for shell_quote().

And finally shell_comment_quote(). It works. I would have written it slightly differently, but no change is needed in the code. The documentation should mention the problem of ancient Bourne shells that do not implement comments.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)