in reply to glob with special characters

No shell is involved. If you strace perl (strace -f -ff on linux), you'll see that no shell is spawned.

AFAIK perl now uses File::Glob internally (1), and in fact foo.pl as

$a=q|a\ \\\{bc.d/*|;print "pattern <", $a, "> globs to <", (glob $a), ">\n"; print "$_\n" for sort keys %INC;

yields

pattern <a\ \\{bc.d/*> globs to <a {bc.d/GOT ITabcde>
Carp.pm
Exporter.pm
File/Glob.pm
Text/ParseWords.pm
XSLoader.pm
strict.pm
vars.pm
warnings.pm
warnings/register.pm

while running

$a=q|a\{bc.d/*|;print "pattern <", $a, "> globs to <", (glob $a), ">\n"; print "$_\n" for sort keys %INC;

results in

pattern <a\{bc.d/*> globs to <a{bc.d/GOT ITabcde>
File/Glob.pm
XSLoader.pm
strict.pm

I suspect a subtlety (bug?) in Text::ParseWords or its usage. The behaviour you describe (which I confirm for my systems) looks like two-pass parsing whenever whitespace is present - in that case the \{ seems to get "optimized away" (running with -Dcr):

Guessing start of match, REx "\\(.)" against "\{bc.d/*"...
Found anchored substr "\" at offset 0...
Guessed: match at offset 0
Matching REx "\\(.)" against "\{bc.d/*"
  Setting an EVAL scope, savestack=56
   0 <> <\{bc.d/*>    |  1: EXACT <\\>
   1 <\> <{bc.d/*>    |  3: OPEN1
   1 <\> <{bc.d/*>    |  5: SANY
   2 <\{> <bc.d/*>    |  6: CLOSE1
   2 <\{> <bc.d/*>    |  8: END
Match successful!
Guessing start of match, REx "\\(.)" against "bc.d/*"...
Did not find anchored substr "\"...
Match rejected by optimizer
Not present...
Match failed

No time to track that down right now... smells like eval involved.

1) From doio.c:

=head1 IO Functions

=for apidoc start_glob

Function called by C<do_readline> to spawn a glob (or do the glob inside
perl on VMS). This code used to be inline, but now perl uses C<File::Glob>
this glob starter is only used by miniperl during the build process.
Moving it away shrinks pp_hot.c; shrinking pp_hot.c helps speed perl up.

=cut
*/

PerlIO *
Perl_start_glob (pTHX_ SV *tmpglob, IO *io)

--shmem

_($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                              /\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

Re^2: glob with special characters
by Hue-Bond (Priest) on Oct 06, 2006 at 21:38 UTC
    I suspect a subtlety (bug?) in Text::ParseWords or its usage.

    The documentation of this module is quite explicit about how it works:

    The $keep argument is a boolean flag. If true, then the tokens are split on the specified delimiter, but all other characters (quotes, backslashes, etc.) are kept in the tokens. If $keep is false then the &*quotewords() functions remove all quotes and backslashes that are not themselves backslash-escaped or inside of single quotes (i.e., &quotewords() tries to interpret these characters just like the Bourne shell).

    The *quotewords functions all call parse_line, which is the one that performs the real job. File::Glob calls parse_line with a $keep argument of 0:

    if ($pat =~ /\s/) {
        # XXX this is needed for compatibility with the csh
        # implementation in Perl. Need to support a flag
        # to disable this behavior.
        require Text::ParseWords;
        @pat = Text::ParseWords::parse_line('\s+',0,$pat);
    }

    So, knowing this, it comes as no surprise that some backslashes are being eaten. As soon as I replace the 0 with a 1, the behaviour of glob begins to match my expectations. What bothers me is that File::Glob is one of those pieces of software so widely used that it seems impossible that this humble little programmer has found a bug in it :^).
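
    To see the effect in isolation, here is a minimal sketch that calls parse_line directly with both values of $keep. The pattern is a made-up example with an escaped space and an escaped brace, not a path from the original question, and the variable names are mine:

    ```perl
    use strict;
    use warnings;
    use Text::ParseWords ();

    # Hypothetical pattern: an escaped space followed by an escaped brace.
    my $pat = 'a\ \{bc.d/*';

    # $keep = 0, as in the File::Glob snippet quoted above:
    # backslash escapes are stripped from the token.
    my @stripped = Text::ParseWords::parse_line('\s+', 0, $pat);

    # $keep = 1: the token comes back exactly as written.
    my @kept = Text::ParseWords::parse_line('\s+', 1, $pat);

    print "keep=0: <$_>\n" for @stripped;   # keep=0: <a {bc.d/*>
    print "keep=1: <$_>\n" for @kept;       # keep=1: <a\ \{bc.d/*>
    ```

    With $keep false, the backslash in front of the brace is already gone by the time the pattern is handed on to the actual globbing code, so the brace no longer arrives there as an escaped character.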

    --
    David Serrano

      Hue-Bond++

      will you file a bug report on this?

      abrazo,
      --shmem

      _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                    /\_¯/(q    /
      ----------------------------  \__(m.====·.(_("always off the crowd"))."·
      ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      400th post

        Yes, I would be glad about it, and with perlbug it shouldn't be difficult to do it right on the first try. Right now I'm investigating some problems with backslashes in filenames, although I think this is a completely separate issue that should be brought up separately. (Update: Bah! somehow File/Glob.pm had returned to its unchanged status).

        Thank you for updating your answer; at the beginning I didn't understand if you were acknowledging the bug or what :^).

        --
        David Serrano, being nearer to become a Perl hacker