PerlMonks  

Regex bug or head made of solid bone?

by alien_life_form (Pilgrim)
on Jun 06, 2002 at 12:54 UTC ( [id://172169]=perlquestion )

alien_life_form has asked for the wisdom of the Perl Monks concerning the following question:

Greetings,

Consider the following snippet:

$orig='aaa([\+\-]{1})bbb';
$qre="\\Q([\\+\\-]{\\E\\d\\Q})\\E";
$bre='\(\[\\\\\+\\\-\]\{\d\}\)';
$str=$orig;
$str =~ s!\Q([\+\-]{\E\d\Q})\E!!;
print "Replace against literal quoted: $str\n";
$str=$orig;
$str =~ s!\(\[\\\+\\\-\]\{\d\}\)!!;
print "Replace against literal bslashed: $str\n";
$str=$orig;
$str =~ s!$qre!!;
print "Replace against variable quoted => $qre <=: $str\n";
$str=$orig;
$str =~ s!$bre!!;
print "Replace against variable bslashed => $bre <=: $str\n";
Running this (with either 5.6.1 or 5.8.0RC1) gives:
c:\temp> D:\perl58\bin\perl.exe re.pl
D:\perl58\bin\perl.exe re.pl
Replace against literal quoted: aaabbb
Replace against literal bslashed: aaabbb
Replace against variable quoted => \Q([\+\-]{\E\d\Q})\E <=: aaa([\+\-]{1})bbb
Replace against variable bslashed => \(\[\\\+\\-\]\{\d\}\) <=: aaabbb

c:\temp>perl re.pl
perl re.pl
Replace against literal quoted: aaabbb
Replace against literal bslashed: aaabbb
Replace against variable quoted => \Q([\+\-]{\E\d\Q})\E <=: aaa([\+\-]{1})bbb
Replace against variable bslashed => \(\[\\\+\\-\]\{\d\}\) <=: aaabbb

C:\temp\perl -v
This is perl, v5.6.1 built for MSWin32-x86-multi-thread
(with 1 registered patch, see perl -V for more detail)
Uh? Am I missing something?
Cheers,
alf
You can't have everything: where would you put it?

Replies are listed 'Best First'.
Re: Regex bug or head made of solid bone?
by Abigail-II (Bishop) on Jun 06, 2002 at 18:32 UTC
    I don't know what exactly your question is. You show us some code, its output, and then just wonder if you are missing something. *I* certainly am missing something, namely your expectation of what the code should do.

    But I'll take an educated guess. You are baffled by the behaviour of \Q and \E. One should realize that \Q and \E are dealt with in the interpolation stage, which happens before a regex is compiled. I refer to the section "Gory details of parsing quoted constructs" in the perlop manual.

    Abigail
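
A minimal sketch of the interpolation-stage point, for readers who want to see it in isolation. The sample text and the variable names ($text, $stored) are made up for illustration and are not from the thread. It assumes only that \Q and \E are processed when they are written in the pattern itself, and not when they arrive inside an interpolated variable:

```perl
use strict;
use warnings;

my $text = 'a.b and axb';

# Written directly in the pattern, \Q...\E is processed while the
# pattern is interpolated: the dot is escaped, so only "a.b" matches.
my @direct = $text =~ /\Qa.b\E/g;

# Stored in a single-quoted string, the characters \Q and \E are plain
# data. They are not processed again when $stored is interpolated, so
# the regex compiler sees an unknown escape \Q and treats it as "Q".
my $stored = '\Qa.b\E';
my @from_variable;
{
    no warnings 'regexp';   # hide the "Unrecognized escape" warnings
    @from_variable = $text =~ /$stored/g;
}

print "direct: @direct\n";                                       # direct: a.b
print "from variable: ", scalar(@from_variable), " match(es)\n"; # from variable: 0 match(es)
```

The second pattern effectively looks for the text "Qa" followed by any character and "bE", which is why nothing matches.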

      Greetings,

      After more grinding, my question actually became two questions. The first is actually about single quoting. The second one is addressed by your guess.

      (And I surely thought that my original post contained sections that qualify as 'some code' and 'output' but this may not be relevant at this point.)
      Cheers,
      alf


      You can't have everything: where would you put it?
Re: Regex bug or head made of solid bone?
by vladb (Vicar) on Jun 06, 2002 at 13:30 UTC
    I've tried to play with this code:
    $orig='aaa([\+\-]{1})bbb';
    $res = '([\\+\\-]{';
    $qres = '\\Q$res\\E';
    $str=$orig;
    $str =~ s!\Q$res\E!!;
    print "Replace against variable quoted => $res <=: $str\n";
    $str=$orig;
    $str =~ s!$qres!!;
    print "Replace against variable quoted => $qres <=: $str\n";
    Examining its output:
    Replace against variable quoted => ([\+\-]{ <=: aaa1})bbb
    Replace against variable quoted => \Q$res\E <=: aaa([\+\-]{1})bbb
    I notice that placing special metacharacters inside your variable doesn't make them 'effective'. However, when they are outside of the variable, they get 'noticed' by the regular expression parser. So, could it be the answer to your inquiry? ;-)
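
If the goal is to keep the whole pattern in a variable, one possible approach (a sketch, not something proposed in the thread) is to do the escaping while building the variable, either with quotemeta() or by precompiling with qr//. The names $escaped, $open, $close and $compiled are hypothetical:

```perl
use strict;
use warnings;

my $orig = 'aaa([\+\-]{1})bbb';

# Option 1: escape the literal pieces with quotemeta() and join them
# with the one piece that should stay a real metacharacter (\d).
my $escaped = quotemeta('([\+\-]{') . '\d' . quotemeta('})');
(my $via_quotemeta = $orig) =~ s/$escaped//;

# Option 2: precompile with qr//. Here \Q and \E are written in the
# pattern itself, so they are processed when the qr// is built.
my $open     = '([\+\-]{';
my $close    = '})';
my $compiled = qr/\Q$open\E\d\Q$close\E/;
(my $via_qr = $orig) =~ s/$compiled//;

print "$via_quotemeta\n";   # aaabbb
print "$via_qr\n";          # aaabbb
```

Both variants rely on the same observation: the escaping has to happen before the string reaches the regex compiler, because the compiler itself does not know \Q and \E.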

    _____________________
    $"=q;grep;;$,=q"grep";for(`find . -name ".saves*~"`){s;$/;;;/(.*-(\d+) +-.*)$/; $_=["ps -e -o pid | "," $2 | "," -v "," "];`@$_`?{print"+ $1"}:{print" +- $1"}&&`rm $1`; print$\;}
      Greetings,

      interesting...
      I wonder whether this is by design or by accident?

      Looking back at my original example, another thing appears to be non-obvious. The (backslashed) portion that reads:

      \\\\\+\\\- # matches literal \+\- in a variable
      had originally been written as:
      \\\+\\\- # matches literal \+\- in a literal regex
      Why I need an additional '\' before a quoted +, and only within a variable, is beyond me... Same reason?
      Cheers,
      alf
      You can't have everything: where would you put it?
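
A likely explanation for the extra backslash, offered as a sketch and not as an answer given in the thread: a single-quoted string turns each \\ into a single \ before the regex compiler ever sees it, while a pattern written directly between the delimiters is passed on with its backslashes intact. The variable names and the sample string below are made up for illustration:

```perl
use strict;
use warnings;

my $three = '\\\+';     # single quotes collapse \\ to \, leaving: \ \ +
my $five  = '\\\\\+';   # leaves: \ \ \ +

print length($three), "\n";   # 3
print length($five),  "\n";   # 4

my $sample = 'x\+y';          # x, backslash, plus, y

# As a regex, \\+ means "one or more backslashes", so only the
# backslash is matched and the plus sign is left alone.
my ($got_three) = $sample =~ /($three)/;

# \\\+ means "a backslash followed by a literal plus".
my ($got_five) = $sample =~ /($five)/;

# Written directly in the pattern, three backslashes are enough,
# because nothing collapses them beforehand.
my ($got_literal) = $sample =~ /(\\\+)/;

print length($got_three),   "\n";   # 1
print length($got_five),    "\n";   # 2
print length($got_literal), "\n";   # 2
```

So the single-quoted string consumes one level of backslashes by itself, which would account for having to write one more of them in the variable than in the literal regex.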

Node Type: perlquestion [id://172169]
Approved by broquaint