Escape special chars in a path

by ovedpo15 (Pilgrim)
on Aug 17, 2022 at 15:30 UTC

ovedpo15 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!
My Perl utility generates a bash script that consists of mkdir/rsync/cp commands.
This bash script is later run by users (so I don't want to actually run those commands when my utility runs, only to generate the script).
Given a UNIX path, I need to do one of two actions, depending on the path type (dir or file):
1. If the path is a directory, then just create it using mkdir.
2. If the path is a file, then copy the file from its directory using rsync or cp (depending on whether the user specified a machine to copy from).
For example, consider this:
touch /a/b/c/d1
In that case, the bash script will look like:
    mkdir -p /tmp/a/b/c
    cp /a/b/c/d1 /tmp/a/b/c
    # Or: rsync -a $USER@MACHINE:/a/b/c/d1 /tmp/a/b/c
The utility works well, unless a path contains "special chars".
I tried to deal with it by escaping and using quotes but I can't seem to cover all cases.
By "special chars" I mean chars like ":",";","(",")","_",....
I tried to use the following two subs:
    sub escape {
        my ($path) = @_;
        if ($path =~ /\\/) {
            $path =~ s/\\/\\\\/g;
        }
        if ($path =~ /\$/) {
            $path =~ s/\$/\\\$/g;
        }
        return $path;
    }

    sub wrap_with_quotes {
        my ($path) = @_;
        if ($path =~ /( |\;|\!)/) {
            return '"'.$path.'"';
        }
        return $path;
    }
I also tried to use quotemeta:
    sub escape {
        my ($path) = @_;
        my $new_path = quotemeta($path);
        $new_path =~ s/\\\//\//g;
        return $new_path;
    }
But it also failed in a lot of cases, and it escaped a lot of chars that don't need it (like ".", "/", etc., which are valid in paths without escaping).
The code looks like:
    foreach my $dir (sort(keys(%dirs))) {
        $dir = escape($dir);
        $dir = wrap_with_quotes($dir);
        print("mkdir -p /tmp/$dir\n");
    }

    foreach my $file (sort(keys(%files))) {
        my $parent_dir = dirname($file);
        my $abs_path   = abs_path($file);
        $abs_path   = escape($abs_path);
        $abs_path   = wrap_with_quotes($abs_path);
        $parent_dir = escape($parent_dir);
        $parent_dir = wrap_with_quotes($parent_dir);
        print("cp $abs_path /tmp/$parent_dir\n");
    }

    foreach my $file (sort(keys(%remote_files))) {
        my $parent_dir = dirname($file);
        my $abs_path   = abs_path($file);
        my $host       = get_host();
        $abs_path   = escape($abs_path);
        $abs_path   = wrap_with_quotes($abs_path);
        $parent_dir = escape($parent_dir);
        $parent_dir = wrap_with_quotes($parent_dir);
        print("rsync -a $host$abs_path /tmp/$parent_dir\n");
    }
Of course, I want to support any kind of path. For example, the special char could be preceded by a "\", and then I need to escape both of them. I built a small test to show what I'm after:
    declare -a special_chars=("!" "@" "#" "$" "%" "^" "_" "-" "=" "+" "[" "]" "(" ")" "{" "}" "'" ":" "," "." ";" " " "\"" "<" ">")

    if [ "$1" == 1 ]; then
        # create playground (before running the bash script)
        for special_char in "${special_chars[@]}"; do
            mkdir -p "/test1/a${special_char}b"
            touch "/test1/a${special_char}b/data"
        done
    else
        # test playground output (after running the bash script)
        for special_char in "${special_chars[@]}"; do
            mkdir -p "/tmp/test1/a${special_char}b"
            if [ "$?" -ne 0 ]; then
                exit 1
            fi
        done
    fi
If 1 is passed to the script, it creates one directory per special char (for example: /test1/a;b), each containing a file named data.
Then I run the generated bash script and then the test script again - if 0 is passed, it will check if the bash script successfully created dirs & copied files into /tmp.
Hope it makes sense.
I also noticed that rsync and cp except different escaping. For example, "/test1/a;b/data" works for cp and "/test1/a\;b/data" works for rsync.
Is there an easy way to handle special chars in a path? All I want is to create mkdir/cp/rsync commands in a bash script so that they will later work.
Please help me to fix the wrap_with_quotes and escape subs or find a better way.

Replies are listed 'Best First'.
Re: Escape special chars in a path
by choroba (Cardinal) on Aug 17, 2022 at 18:25 UTC
    There's String::ShellQuote on CPAN.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      There's String::ShellQuote on CPAN.

      A name that promises much more than the module actually implements. With a good amount of luck, it may work for this very specific problem ("My Perl utility generates a bash script that consists of mkdir/rsync/cp commands. This bash script is later used by users [...]"). After all, bash should be a bourne-compatible shell. But don't assume that module can do everything its name promises.

      Quoting myself:

      The last two times I looked at String::ShellQuote (Re^2: Passing values from Perl script to shell script in 2009 and Re^4: quoting/escaping file names in 2014), it did not look good. The last change is from 2010. So, String::ShellQuote still has the same problems as in 2009 [...]: It works just for some unnamed version of some unnamed bourne shell. The author wanted to add more shells, but he did not since 2005. The test.t look very strange, especially I don't see any reasonable test for passing arguments via a shell. So, it's old, unmaintained, not well-tested, and broken for all shells except for that unspecified bourne shell.

      See also:

      Update:

      I had a closer look at the source code and did some tests: Problems with String::ShellQuote

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Escape special chars in a path
by johngg (Canon) on Aug 17, 2022 at 21:26 UTC

    This is more of an observation rather than answering your question but using alternative delimiters in your substitutions can make the code much easier to read. Also I find that it is easier to use the hex value for a backslash rather than the character itself, e.g. converting to Unix-style paths ...

    johngg@aleatico:~$ perl -Mstrict -Mwarnings -E 'say q{}; my( $path ) = qw{ C:\Users\fred\file.txt }; say $path; $path =~ s{\x5c}{/}g; say $path;'

    C:\Users\fred\file.txt
    C:/Users/fred/file.txt

    Going the other way, I would use a lookahead, the chr function and the e modifier in a substitution to do escaping.

    johngg@aleatico:~$ perl -Mstrict -Mwarnings -E 'say q{}; my( $path ) = qw{ C:\Users\fred\file.txt }; say $path; $path =~ s{(?=\x5c)}{ chr 92 }eg; say $path;'

    C:\Users\fred\file.txt
    C:\\Users\\fred\\file.txt

    It seems easier to my eyes. I hope this is helpful.

    Cheers,

    JohnGG

Re: Escape special chars in a path
by kcott (Archbishop) on Aug 18, 2022 at 01:38 UTC

    G'day ovedpo15,

    I'd use a bracketed character class. A negated one is possibly the easiest. If you want to individually specify special characters, see "Special Characters Inside a Bracketed Character Class". Here's an example.

        #!/usr/bin/env perl

        use strict;
        use warnings;

        use Test::More;

        my @tests = (
            ['a!b@c#', 'a\!b\@c\#'],
            ['{d}e(f)', '\{d\}e\(f\)'],
            [q{g"h'i}, q{g\"h\'i}],
            ['[/].\\', '\[/\].\\\\'],
        );

        plan tests => 0+@tests;

        my $re = qr{([^0-9A-Za-z./])};

        for my $test (@tests) {
            my ($got, $exp) = @$test;
            $got =~ s/$re/\Q$1/g;
            ok $got eq $exp;
        }

    Output:

        1..4
        ok 1
        ok 2
        ok 3
        ok 4

    Extend those tests with real pathnames from your application.
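One way to extend the tests is to check each escaped name against a real shell instead of a hand-written expectation. The following is only a sketch: `backslash_escape` and `shell_sees` are hypothetical helper names wrapping the substitution above, and it assumes a POSIX `sh` with `printf` is on the PATH. A name containing a newline would not survive this scheme, because the shell treats backslash-newline as a line continuation.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution shown above.
sub backslash_escape {
    my ($s) = @_;
    $s =~ s{([^0-9A-Za-z./])}{\Q$1}g;
    return $s;
}

# Hypothetical helper: let sh parse the escaped word and report what it saw.
sub shell_sees {
    my ($escaped) = @_;
    open my $fh, '-|', 'sh', '-c', "printf %s $escaped"
        or die "cannot run sh: $!";
    local $/;                       # slurp all output at once
    my $out = <$fh>;
    close $fh;
    return defined $out ? $out : '';
}

# The special characters from the question's test script.
for my $ch (split //, q{!@#$%^_-=+[](){}':,.; "<>}) {
    my $name = "a${ch}b";
    my $back = shell_sees(backslash_escape($name));
    print $back eq $name ? 'ok' : 'not ok', " - $name\n";
}
```

A line starting with "not ok" would point at a character the regex handles differently from the shell.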

    You may need different regexes for rsync and cp (depends on whether "except" should be "accept" or "expect" :-)
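A likely reason for the difference: with many rsync versions the remote part of a `host:path` argument is parsed a second time by a shell on the remote machine, while cp arguments are parsed only once. A possible way around needing two escaping schemes is rsync's `-s` (`--protect-args`) option, which asks rsync not to let the remote shell interpret the file names. The sketch below assumes the rsync on both ends supports `-s`; `sh_quote` and `rsync_command` are hypothetical helper names, and the host string is a placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper: single-quote a word for the local shell,
# writing each embedded ' as '"'"'.
sub sh_quote {
    my ($s) = @_;
    $s =~ s/'/'"'"'/g;
    return "'$s'";
}

# Hypothetical generator for one rsync line. With -s only the local
# shell needs quoting, so the same sh_quote serves cp and rsync alike.
sub rsync_command {
    my ($host, $remote_path, $local_dir) = @_;
    return join ' ', 'rsync', '-a', '-s',
        sh_quote("$host:$remote_path"), sh_quote($local_dir);
}

print rsync_command('user@machine', '/test1/a;b/data', '/tmp/test1/a;b'), "\n";
```

If `-s` is not available on the target systems, the remote path would need a second layer of quoting for the remote shell, and that behaviour differs between rsync releases, so it is worth testing against the versions actually in use.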

    — Ken

Re: Escape special chars in a path
by tybalt89 (Monsignor) on Aug 18, 2022 at 22:40 UTC

    bash quoting is simple: put single quotes around the string after replacing every ' in the string with '"'"'

    sub bashquote { "'" . shift =~ s/'/'"'"'/gr . "'" }

    Sure, sometimes the wrapping single quotes are not needed, but if they are just around directory names and file names, they don't hurt.
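Applied to the original question, the same idea could generate the mkdir and cp lines as in the sketch below. `sh_quote` is just the one-liner above written out long-hand, and `copy_commands` is a hypothetical helper, not part of the original utility.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);

# Same technique as bashquote: wrap in single quotes and
# write each embedded ' as '"'"'.
sub sh_quote {
    my ($s) = @_;
    $s =~ s/'/'"'"'/g;
    return "'$s'";
}

# Hypothetical generator: script lines that copy one local file
# to the same path under $dest_root.
sub copy_commands {
    my ($file, $dest_root) = @_;
    my $target_dir = $dest_root . dirname($file);
    return (
        'mkdir -p ' . sh_quote($target_dir),
        'cp ' . sh_quote($file) . ' ' . sh_quote($target_dir),
    );
}

print "$_\n" for copy_commands(q{/test1/a;b/data}, '/tmp');
print "$_\n" for copy_commands(q{/test1/a'b/data}, '/tmp');
```

The quoting is applied to each whole argument after the path has been assembled, so there is no need to decide per character what is special.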

      Used as a base for a repair attempt for String::ShellQuote: Re: Problems with String::ShellQuote

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Node Type: perlquestion [id://11146197]
Approved by marto
Front-paged by Corion