drewk has asked for the wisdom of the Perl Monks concerning the following question:

I have reached the limit of my knowledge of BASH and Perl, but I think that I may have figured it out anyway. Please tell me, or give me a hint, if I did not.

I have a Perl script that is dealing with hundreds of thousands of files. Many of the names are awkward or outright hostile when I want to execute a shell command on the file. I endearingly call them 'mean file names.'

I can single quote (') or double quote the mean file name before I execute a shell command, but I still need to deal with the special meaning of some characters. Some files have the single quote as part of the name. Since I cannot escape that, or rename the file, I need to use double quotes.

With Bash, the characters: $ ` \ <newline> " have special meaning within the double quotes.
QUOTING (from the BASH man page)

Quoting is used to remove the special meaning of certain characters or words to the shell. Quoting can be used to disable special treatment for special characters, to prevent reserved words from being recognized as such, and to prevent parameter expansion.

Each of the metacharacters listed above under DEFINITIONS has special meaning to the shell and must be quoted if it is to represent itself.

When the command history expansion facilities are being used, the history expansion character, usually !, must be quoted to prevent history expansion.

There are three quoting mechanisms: the escape character, single quotes, and double quotes.

A non-quoted backslash (\) is the escape character. It preserves the literal value of the next character that follows, with the exception of <newline>. If a \<newline> pair appears, and the backslash is not itself quoted, the \<newline> is treated as a line continuation (that is, it is removed from the input stream and effectively ignored).

Enclosing characters in single quotes preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.

Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $, `, and \. The characters $ and ` retain their special meaning within double quotes. The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or <newline>. A double quote may be quoted within double quotes by preceding it with a backslash.
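
For illustration only: the single-quote rule quoted above forbids a single quote between single quotes, but a name can still be passed by closing the quoted string, adding a backslash-escaped quote, and reopening it. The following is a minimal sketch of that idiom, assuming a POSIX-style /bin/sh; the helper name sh_single_quote is hypothetical and not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a string in single quotes for a POSIX-style shell.
# An embedded ' is written as '\'' (close the quotes, add an escaped quote,
# reopen the quotes). Nothing else is special inside single quotes.
sub sh_single_quote {
    my ($text) = @_;
    $text =~ s/'/'\\''/g;
    return "'$text'";
}

# A name containing a quote, $, backticks, double quotes and a backslash.
my $name   = q{it's a $mean `name` "here" \\};
my $quoted = sh_single_quote($name);

# Round trip through the shell: printf should hand the name back unchanged.
my $echoed = `printf '%s' $quoted`;
print $echoed eq $name ? "round trip ok\n" : "mismatch\n";
```

This only addresses quoting for sh-like shells; it makes no promise about other shells or about what the invoked utility does with the name.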



The file names cause no trouble within Perl, but they can cause trouble in the shell when I execute a shell command (qx{}, backticks, or system()) using the same mean file names.

I believe that I solved it with a regex that adds the appropriate escapes to the file name prior to the system() call. The ` " $ are easy and straightforward (I think!) but the \ is more of a worry. Within double quotes, each \\ pair collapses to a single \, and \` \$ \" collapse to those characters.

The following code demonstrates.

#!/usr/bin/perl

sub bash_q_esc {
    my $text = shift;
    $text =~ s/\\\\/\\\\\\\\/g;           # double each existing pair of backslashes
    $text =~ s/(?<!\\)\\(?!\\)/\\\\/g;    # escape a lone backslash
    $text =~ s/(?<!\\)\$/\\\$/g;          # escape $ not preceded by a backslash
    $text =~ s/(?<!\\)"/\\"/g;            # same for "
    $text =~ s/(?<!\\)`/\\`/g;            # same for `
    return $text;
}

my $dir = "/Users/andrew/test";    # choose a dir for you...

while (<DATA>) {
    chomp;
    my $rtr = bash_q_esc($_);
    my $cmd_rtr;
    # 'or' rather than '||', so that die applies to open and not to the string
    open FH, ">$dir/$_" or die "can't open $_ $!\n";
    close FH;
    $cmd_rtr = `ls -l "$dir/$rtr"`;    # harmless shell pipeline
    print "rtr=$cmd_rtr\n$_\n$rtr\n";
}
print "--------------";

# mean file names:
__DATA__
` 123 # ` 456 $ e ` file name $ no esc $ in the $ $tring$ "file name in quotes" $ esc \$ in the \$ $tring$ back slashes \ \ end \$ back slash $ start this and end $\ mean file \\\\\\\?\\ fg \n \b ==?`

Did I get it right???

Replies are listed 'Best First'.
Re: RE and BASH and mean file name shell quoting
by rhesa (Vicar) on Feb 16, 2006 at 06:02 UTC
      This turned out to work for MY part of the project. THANKS!

      Regrettably, not all the utilities I am using support 'mean file names', so I need to parse the file names down to ASCII...

      Thanks again...
Re: RE and BASH and mean file name shell quoting
by Tanktalus (Canon) on Feb 16, 2006 at 17:19 UTC

    Actually, you want to use the list form of system and/or exec. If you have perl 5.8, you can also use the long form of open, such as open my $fh, '-|', 'some_cmd', $mean_file_name. Otherwise you may be surprised when something changes with a shell upgrade or a shell change, e.g., when you give your script to a C-shell-wielding wizard.
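
A minimal sketch of both forms, assuming perl 5.8 or later and an ls that accepts -- to end its options. The temporary directory and the sample file name are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Made-up example: a scratch directory and one mean file name.
my $dir  = tempdir(CLEANUP => 1);
my $name = q{mean $file `name` "quoted" it's \ here!};
my $path = "$dir/$name";

open(my $out, '>', $path) or die "can't create $path: $!\n";
close $out;

# List form of system: each argument goes to ls as-is, no shell is involved.
system('ls', '-l', '--', $path) == 0 or die "ls failed: $?\n";

# List form of a piped open ('-|' reads from the command's output).
open(my $ls, '-|', 'ls', '--', $path) or die "can't run ls: $!\n";
chomp(my $listed = <$ls>);
close $ls;

print $listed eq $path ? "same name came back\n" : "names differ\n";
```

Since no shell ever parses the arguments, no escaping function is needed for this name.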

    Oh, and you missed !. But if you used the multi-arg version of system, exec, and open, it wouldn't matter since you wouldn't be using any shell.

    I actually had a recent escapade like this at work with a coworker trying to escape things for a system call. In C. Then I told her to stop that, and use fork and exec with the array form of exec. Problem solved. Otherwise, I would have been able to come up with scenario after scenario where her code broke, so better to just fix it once, by completely ignoring the shell.
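
The same fork-and-exec idea can be sketched in Perl. This is an illustration, not the code from that story; the helper name run_no_shell is hypothetical, and it uses the exec { PROGRAM } LIST form so that perl does not consult a shell even when the list has a single element.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run a command without any shell, return its exit status.
sub run_no_shell {
    my @cmd = @_;
    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;
    if ($pid == 0) {
        # Child: the block form of exec passes @cmd straight to the program.
        exec { $cmd[0] } @cmd;
        warn "exec of $cmd[0] failed: $!\n";
        POSIX::_exit(127);    # leave the child without running parent cleanup
    }
    waitpid($pid, 0);
    return $? >> 8;
}

# $^X is the running perl; the child exits 0 only if it saw exactly one argument.
my $status = run_no_shell($^X, '-e', 'exit(@ARGV == 1 ? 0 : 1)',
                          q{one $arg; with `shell` chars!});
print "child saw exactly one argument\n" if $status == 0;
```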

    (Yes, I know that perl always calls /bin/sh. Unless some crazy goes and modifies $Config::Config{sh}. But don't do that.)