EigenFunctions has asked for the wisdom of the Perl Monks concerning the following question:

I am making the transition from windoz (win7) to Ubuntu (18.04) and I can't seem to get Perl to understand path/file names. The path and the file names always have spaces and special characters (![]# etc.). I have tried every combination of quoting I could think of and nothing seems to work. I know the file exists because I copy and paste the text on the command line and do an ls -l and the file exists.

Here is some code

    :
    :
    #my $BibleFullNames = q{~/Aoo/01-Writer/Other/UMC~~ Changewater/[01] Team Leader- Worship Service/Bible Books, Full Names.txt};
    #my $BibleFullNames = q{~/Aoo/01-Writer/Other/UMC\~\~\ Changewater/\[01\]\ Team\ Leader-\ Worship\ Service/Bible\ Books,\ Full\ Names.txt};
    #my $BibleFullNames = q{"~/Aoo/01-Writer/Other/UMC~~ Changewater/[01] Team Leader- Worship Service/Bible Books, Full Names.txt"};
    #my $BibleFullNames = q{'~/Aoo/01-Writer/Other/UMC~~ Changewater/[01] Team Leader- Worship Service/Bible Books, Full Names.txt'};
    #my $BibleFullNames = q{"~/Aoo/01-Writer/Other/UMC\~\~\ Changewater/\[01\]\ Team\ Leader-\ Worship\ Service/Bible\ Books,\ Full\ Names.txt"};
    my $BibleFullNames = q{'~/Aoo/01-Writer/Other/UMC\~\~\ Changewater/\[01\]\ Team\ Leader-\ Worship\ Service/Bible\ Books,\ Full\ Names.txt'};
    :
    :
    if (!(-e $BibleFullNames)) {
        Log2die("**Fatal err, can't find Bible names file (\"$BibleFullNames\")");
    }
    :
    :

I uncommented each variation, one at a time, and ran the program, only to have every one of them fail to find the file.

Any help would be appreciated

Thanks,
  EigenFunctions
  OpSys: Win7 x64 Service Pack 1 Professional/Home Premium; Perl: Strawberry (v5.22.0)/ActiveState (v5.14.2)

  OpSys: Ubuntu 18.04; Perl: Perl 5, version 26, subversion 1 (v5.26.1) built for x86_64-linux-gnu-thread-multi (with 67 registered patches)

Replies are listed 'Best First'.
Re: Ubuntu File Names with spaces and special characters
by bliako (Abbot) on Feb 26, 2019 at 12:52 UTC

    The problem arises because you quote AND escape (\). Choose one or the other.

    Secondly, the ~ character has meaning within the shell. Perl, I guess, sees ~ as part of the filename and does not expand it to mean home-dir as the unix shell would. So avoid using linux shell shortcuts in a Perl expression.

    This sort of quoting should work, but notice that I removed the tilde. You need to substitute it with a Perl equivalent, perhaps $ENV{HOME}:

    my $BibleFullNames = $ENV{HOME}."/Aoo/01-Writer/Other/UMC\~\~\ Changewater/\[01\]\ Team\ Leader-\ Worship\ Service/Bible\ Books,\ Full\ Names.txt";
    if (!(-e $BibleFullNames)) {
        Log2die("**Fatal err, can't find Bible names file (\"$BibleFullNames\")");
    }
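    Since no shell is involved in a -e test, the same path can also be written with no escaping at all. The following is only a minimal sketch of that idea: it uses a temporary directory as a stand-in for the real home directory (an assumption made so that it can run anywhere), and the variable names $rel, $found and $tilde_found are hypothetical.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Stand-in for the real home directory (assumption, so the sketch is self-contained).
my $home = tempdir(CLEANUP => 1);

# q{} keeps every character literally: no backslashes, no extra quotes.
my $rel  = q{Aoo/01-Writer/Other/UMC~~ Changewater/[01] Team Leader- Worship Service};
my $name = q{Bible Books, Full Names.txt};
my $dir  = "$home/$rel";
my $file = "$dir/$name";

# Create the directory tree and an empty file so that -e has something to find.
make_path($dir);
open my $fh, '>', $file or die "Cannot create '$file': $!";
close $fh;

# The literal path is found as-is.
my $found = -e $file ? 1 : 0;

# A leading "~" is not expanded by Perl; this looks for a directory named "~".
my $tilde_found = -e "~/$rel/$name" ? 1 : 0;

print "plain path found: $found\n";
print "tilde path found: $tilde_found\n";
```

    In a real script $home would simply be $ENV{HOME}, as in the snippet above.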

    EDIT: you may find this useful: Re^3: quoting/escaping file names, it mentions String::ShellQuote which will do all the dirty work for you.

    EDIT2: the more I think about it, the more I am convinced that you SHOULD use a module for escaping your filenames (String::ShellQuote works only for the Bourne shell; I would search for an equivalent of DBI's quote()), and also a module to handle the path separator (/ or \\ or // or whatever). Then your Perl script will be portable and will work on both OSes without further changes. For example, here is how I construct paths in a portable way: use File::Spec; my $path = File::Spec->catfile('a', 'b', 'c');.
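    Applied to the path from the question, the File::Spec approach might look like the sketch below. The fallback to the root directory when $ENV{HOME} is unset is an assumption added only so that the example always runs, and the variable names are hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Assumption: fall back to the root directory if HOME is not set.
my $home = $ENV{HOME} // File::Spec->rootdir;

# Each component is a plain string; spaces, tildes and brackets need no escaping.
my @dirs = (
    'Aoo', '01-Writer', 'Other',
    'UMC~~ Changewater',
    '[01] Team Leader- Worship Service',
);
my $name = 'Bible Books, Full Names.txt';

# catfile joins the pieces with the separator of the current OS.
my $path = File::Spec->catfile($home, @dirs, $name);
print "$path\n";

# splitpath takes the path apart again: volume, directories, file name.
my (undef, undef, $basename) = File::Spec->splitpath($path);
print "$basename\n";
```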

    EDIT3 (24h+): the difference between $x = "\ "; and $y = '\ '; is that $x will contain a single space, whereas $y will contain a backslash followed by a space. Perl single quotes do not interpolate their contents, unlike double quotes, which process both escape sequences and variables. In $y the backslash stays in the string and is passed on to whatever consumes it, the shell included.

    So, if you want to pass a filename to the shell for processing, as in system("ls $file"), then $file should contain literal backslashes to escape any spaces in it, as in $y. Alternatively, use system("ls '$file'") (notice the single quotes) when $file is something like $x and does not contain any backslashes (this breaks if the name itself contains a single quote). If you do -e $file, then $file should be something like $x, because any backslash that is actually in the string held by $file is taken as part of the filename.

    So there are some queer characters which in the 70's were marginalised and discriminated against but now have rightly claimed equal treatment, being part of the ASCII-kind and all. Still, old software demands that these characters be escaped. Escaping takes place at many levels simply because one program passes parameters on to another, and one layer of backslashes is lost at each hand-off. So Perl passes a parameter to the shell (e.g. via a system ls)? Then make sure that parameter includes backslashes to escape the funny characters. Does that command call yet another command, which calls another? Then you may have to escape twice, and so on. I am definitely not an expert in escaping, though, and I may have missed some crucial points.
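    The difference between the two kinds of string can be checked with a small experiment. This is only a sketch, assuming a Unix-like system where the backslash is an ordinary filename character; the file name and the variables $x_exists and $y_exists are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Double quotes: "\ " is turned into a plain space.
my $x = "$dir/Bible\ Books.txt";

# Single quotes: the backslash stays in the string.
my $y = $dir . '/Bible\ Books.txt';

# Create the file under its real name, the one with a plain space.
open my $fh, '>', $x or die "Cannot create '$x': $!";
close $fh;

# -e takes the string literally, so only $x names the file that exists.
my $x_exists = -e $x ? 1 : 0;
my $y_exists = -e $y ? 1 : 0;

print "x <$x> exists: $x_exists\n";
print "y <$y> exists: $y_exists\n";
```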

    bw, bliako

      [...] Perl single quotes do not interpolate their contents, unlike the double quotes [...]

      One fine point to add is that the delimiting character needs to be escaped in single quotes and the backslash itself will be escaped if there are two in a row.

      use warnings;
      use strict;

      #my $y1 = ' \';             ## error
      my $y1 = ' \\';             ## produces ' \'
      #my $y2 = q{\};             ## error
      my $y2 = q{ \\};            ## produces ' \'
      my $y3 = q{\\share\dir\\};  ## produces '\share\dir\'
      my $y4 = ' \'';             ## produces " '"
      my $y5 = q{ \}};            ## produces ' }'

      print "y1<$y1>\n";
      print "y2<$y2>\n";
      print "y3<$y3>\n";
      print "y4<$y4>\n";
      print "y5<$y5>\n";

      __DATA__
      y1< \>
      y2< \>
      y3<\share\dir\>
      y4< '>
      y5< }>

        Thanks for adding that. Plenty of things I was not aware of.

      That explains a lot! My problem is that the squiggly (aka tilde) is NOT interpreted within SINGLE quotes. I tried String::ShellQuote, but it simply enclosed the input with single quotes. I suppose using the module helps with portability, but I doubt I'll ever need to worry about porting the code.

      What about the shell? Is it the login shell (Bash) or the shell the Perl source is invoked from (tcsh) or just Linux? In the case of paths, I don't think it matters, but other instances it may make a difference (e.g., back-ticks, etc.).

      In any event I believe that solves the problem and maybe, many other problems I've had.

      Thanks,
        EigenFunctions
        OpSys: Ubuntu 18.04; Perl: Perl 5, version 26, subversion 1 (v5.26.1) built for x86_64-linux-gnu-thread-multi (with 67 registered patches)

        OpSys: Win7 x64 Service Pack 1 Professional/Home Premium; Perl: Strawberry (v5.22.0)/ActiveState (v5.14.2)

        That explains a lot! My problem is that the squiggly (aka tilde) is NOT interpreted within SINGLE quotes.

        The tilde ~ is in general not special to Perl. To get the home directory, there's e.g. File::HomeDir, or you could use glob - but be aware that it'll treat any glob metacharacters as special, and there are several caveats when it comes to glob, which is why you might be interested in Path::ExpandTilde instead.
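        As a rough illustration of the glob route and one of its caveats, here is a sketch that uses only core modules. It expands a lone ~ with File::Glob::bsd_glob and appends the rest of the path as an ordinary string, so that the spaces and brackets never reach glob. The variable names are hypothetical.

```perl
use strict;
use warnings;
use File::Glob ();

# Caveat: the core glob splits its argument on whitespace,
# so a name containing a space comes back as two patterns.
my @split = glob('UMC~~ Changewater');
my $count = scalar @split;
print "core glob returned $count item(s): @split\n";

# Expand only the bare tilde; bsd_glob does not split on whitespace.
my ($home) = File::Glob::bsd_glob('~');

# The rest of the path is plain text and is simply appended.
my $tail = 'Aoo/01-Writer/Other/UMC~~ Changewater/'
         . '[01] Team Leader- Worship Service/Bible Books, Full Names.txt';
my $path = "$home/$tail";
print "$path\n";
```

        Keeping the metacharacters out of the glob call sidesteps the question of how [01] would be interpreted as a pattern.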

        What about the shell? Is it the login shell (Bash) or the shell the Perl source is invoked from (tcsh) or just Linux?

        That's not always clear, which is one of the reasons I usually recommend avoiding the shell altogether.
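        One way to avoid the shell, sketched here under the assumption of a Unix-like system with ls and cat on the PATH, is the list form of system and of the piped open. With more than one argument the command is run directly and each argument reaches the program unchanged, so no quoting or escaping is needed. (For the single-string form, perlfunc documents that on Unix platforms a command containing shell metacharacters is passed to /bin/sh -c, not to the login shell.) The file name and contents below are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/[01] Bible Books, Full Names.txt";

# Create a small file to work with.
open my $out, '>', $file or die "Cannot create '$file': $!";
print {$out} "Genesis\n";
close $out;

# List form of system: no shell, the file name is passed as one argument.
my $status = system('ls', '-l', $file);
print "ls exit status: $status\n";

# List form of a piped open: capture a command's output, again without a shell.
open my $pipe, '-|', 'cat', $file or die "Cannot run cat: $!";
my @lines = <$pipe>;
close $pipe;
print "first line: $lines[0]";
```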