in reply to Re: Re: Parsing strings into @ARGV
in thread Parsing strings into @ARGV

Yes, it's a bug. This is the offending part:
    sub shellwords {
        local(@lines) = @_;
        $lines[$#lines] =~ s/\s+$//;
        return(quotewords('\s+', 0, @lines));
    }

The bug is the unrestricted deletion of trailing whitespace: it shouldn't delete whitespace that's preceded by an odd number of backslashes, because that whitespace is escaped. I'll see if I can come up with a patch later tonight.
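
A minimal sketch of the problem, applying the same substitution directly to a string instead of calling shellwords (whose behaviour may differ between Text::ParseWords versions). The sample input is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical input: the last argument ends in an escaped space.
my $line = 'foo bar\ ';

# The same unrestricted trim that shellwords applies to its last line.
(my $trimmed = $line) =~ s/\s+$//;

# The escaped space is gone, leaving a dangling backslash.
print "[$trimmed]\n";    # prints [foo bar\]
```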

Note that that's not the only case where Text::ParseWords and my shell (bash) disagree. Given the string "foo\'bar", my shell parses that as foo\'bar, while Text::ParseWords turns it into foo'bar.

Abigail

Re: Re: Parsing strings into @ARGV
by mirod (Canon) on Apr 23, 2004 at 16:15 UTC

    Yes, I saw that, here is my fix:

        $lines[$#lines] =~ s{(\\.)?\s*$}{defined $1 ? $1 : ''}e;

    I don't really like it, but it passes my tests:

    #!/usr/bin/perl -w
    use strict;
    use Test::More qw(no_plan);

    while( <DATA>) {
        my( $input, $expected)= m{^'(.*?)'\s*=>\s*'(.*?)'};
        (my $trimmed= $input)=~ s{(\\.)?\s*$}{defined $1 ? $1 : ''}e;
        is( $trimmed, $expected, "'$input'");
    }

    __DATA__
    't' =>'t'
    't ' =>'t'
    't\ ' =>'t\ '
    't\ ' =>'t\ '
    't\ \ '=>'t\ \ '

      Your solution fails on 't\\\ ', yielding 't\\\', but I expect 't\\\ '.

      $ echo t\\\ | cat -e
      t\ $

      Abigail
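
      The failure can also be reproduced without a shell. A small sketch, using the substitution from the parent post on the input above: the leftmost match starts at the second backslash, so (\\.) captures the last two backslashes as a pair and the escaped space is left for \s* to consume.

      ```perl
      use strict;
      use warnings;

      # t, three backslashes, one space
      my $input = 't' . ('\\' x 3) . ' ';

      # The substitution proposed in the parent post.
      (my $trimmed = $input) =~ s{(\\.)?\s*$}{defined $1 ? $1 : ''}e;

      printf "input:   [%s]\n", $input;      # [t\\\ ]
      printf "trimmed: [%s]\n", $trimmed;    # [t\\\]  (escaped space lost)
      ```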

        Let's try again, trying to keep the regexp anchored at the end:

        s{(\\*)(\s+)$}{
            if( $1) {
                if( length( $1) % 2) { $1 . substr( $2, 0, 1) }
                else                 { $1 }
            }
            else { '' }
        }e;

        A bit ugly, but I would think it works: keep the first whitespace character in the final sequence if it comes after an odd number of backslashes.

        s{^((\\.|.)*?)\s*$}{defined $1 ? $1 : ''}e; works too, but I suspect it is slower (no, I don't have a benchmark to prove it).
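
        To make the odd/even rule easier to follow, here is a sketch of the first substitution wrapped in a sub. The name trim_trailing_ws is made up for this example and is not part of Text::ParseWords; the nested if/else is folded into one ternary, which should behave the same way:

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper, not part of Text::ParseWords.
        # An odd number of backslashes before the trailing whitespace means
        # the first whitespace character is escaped, so it is kept.
        sub trim_trailing_ws {
            my ($str) = @_;
            $str =~ s{(\\*)(\s+)$}{
                length($1) % 2 ? $1 . substr($2, 0, 1) : $1
            }e;
            return $str;
        }

        # Try 0 to 3 backslashes, each followed by two trailing spaces.
        for my $n (0 .. 3) {
            my $in = 't' . ('\\' x $n) . '  ';
            printf "%d backslash(es): [%s] => [%s]\n",
                $n, $in, trim_trailing_ws($in);
        }
        ```

        With an even count (0 or 2) both spaces are removed; with an odd count (1 or 3) exactly one space survives.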

        The test again:

        #!/usr/bin/perl -w
        use strict;
        use Test::More qw(no_plan);

        while( <DATA>) {
            my( $input, $expected)= m{^'(.*?)'\s*=>\s*'(.*?)'};
            #(my $trimmed= $input)=~ s{^((\\.|.)*?)\s*$}{defined $1 ? $1 : ''}e;
            (my $trimmed= $input)=~ s{(\\*)(\s+)$}{
                if( $1) {
                    if( length( $1) % 2) { $1 . substr( $2, 0, 1) }
                    else                 { $1 }
                }
                else { '' }
            }e;
            is( $trimmed, $expected, "'$input'");
        }

        __DATA__
        '' => ''
        't' => 't'
        't ' => 't'
        't\ ' => 't\ '
        't\ ' => 't\ '
        't\ \ ' => 't\ \ '
        't\\\ ' => 't\\\ '
        't\\ ' => 't\\'

Re: Re: Parsing strings into @ARGV
by mirod (Canon) on Apr 23, 2004 at 16:24 UTC

    bash, sh, csh and tcsh all give foo\'bar. This one looks like more of a pain to fix.

    Are there tests for Text::ParseWords somewhere? The CPAN version has none, which makes it quite dangerous to work on.

      There are some really old tests in bleadperl. I think they might date back to 5.6.0 or 5.6.1.