throop has asked for the wisdom of the Perl Monks concerning the following question:

Brethren

Aren't "\cI" and "\t" equivalent? I'm seeing this oddly different behavior (as seen from the debugger):

DB<1> x "ARTICLE\tNOUN" =~ /^(ARTICLE\cI)(.+)/x
0 'ARTICLE'
1 "\cINOUN"
DB<2> x "ARTICLE\tNOUN" =~ /^(ARTICLE\t)(.+)/x
0 "ARTICLE\cI"
1 'NOUN'
The tab gets captured in the first set of parens when I use \t, but not when I use \cI. I don't understand <1> at all: if the tab isn't captured in the first set of parens, I don't see how the match succeeded. \cI and \t seem to be handled the same way in strings. Is there a subtle difference in regexes?

I'm running v5.8.8 built for i386-linux-thread-multi

Re: \cI vs \t in regex (bug)
by tye (Sage) on Apr 24, 2007 at 19:17 UTC

    "bug" in the interaction of \cI and /x. Note:

    > perl -MO=Deparse -e"/\cI/x"
    / /x;
    -e syntax OK
    > perl -MO=Deparse -e"/\t/x"
    /\t/x;
    -e syntax OK

    Note that I'd never use \cI since \t is portable.

    - tye        
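
A possible way to see why DB<1> still matches: under /x, literal whitespace in a pattern is ignored. If \cI reaches the regex engine as an actual tab character (which is what the Deparse output above suggests for this Perl), the first group is effectively just ARTICLE, and (.+) picks up the tab. The sketch below reproduces that effect without relying on the \cI behavior itself, by interpolating a pattern string that holds a literal tab. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# In an ordinary double-quoted string the two escapes yield the same character.
my $same = ("\cI" eq "\t") ? 1 : 0;

# Hypothetical pattern text containing a *literal* tab character (not the two
# characters backslash and "t"). It stands in for what /\cI/x became above.
my $literal_tab_pat = "^(ARTICLE\t)(.+)";

# Under /x the literal tab counts as ignorable whitespace, so group 1 is just
# "ARTICLE" and (.+) swallows the tab from the target string.
my @x_caps = "ARTICLE\tNOUN" =~ /$literal_tab_pat/x;

# Without /x the literal tab is significant and ends up in group 1.
my @plain_caps = "ARTICLE\tNOUN" =~ /$literal_tab_pat/;

# Print the captures with tabs made visible.
for my $caps (\@x_caps, \@plain_caps) {
    (my $shown = join '|', @$caps) =~ s/\t/\\t/g;
    print "$shown\n";
}
```

The only difference between the two matches is the /x flag, which is enough to move the tab from the first capture to the second.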

      Thanks - I'm slightly glad to know that it's a bug and not Yet Another Obscure PERL feature that I'd missed. :-)

      I ran across it because I was trying to run down a failure to match a pattern, where the pattern was stored (uncompiled) in a variable. I printed out the variable in the debugger - although I'd composed it with \t it prints with \cI. So I was cutting and pasting, trying to narrow down where the pattern match was failing - and got horribly bitten.

      Such is war and programming. Meta-question: where is the right place to report bugs like that?

      throop
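
For the stored-pattern case described above, one approach that should sidestep the problem is to make sure the backslash escape, rather than a literal tab, is what reaches the regex engine. A minimal sketch, assuming the pattern can be written with single quotes or qr// (both variable names are hypothetical):

```perl
use strict;
use warnings;

# Single quotes: the backslash and the "t" survive as two characters,
# so the regex engine sees the \t escape even under /x.
my $escaped = '^(ARTICLE\t)(.+)';

# qr// keeps the escape and the /x flag together in one compiled object.
my $compiled = qr/^(ARTICLE\t)(.+)/x;

my @from_string = "ARTICLE\tNOUN" =~ /$escaped/x;
my @from_qr     = "ARTICLE\tNOUN" =~ $compiled;

# Print the captures with tabs made visible.
for my $caps (\@from_string, \@from_qr) {
    (my $shown = join '|', @$caps) =~ s/\t/\\t/g;
    print "$shown\n";
}
```

In both cases the tab lands in the first capture. A pattern built in a double-quoted string would instead carry a literal tab, which /x ignores.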

      If you are using t as your regex delimiter, \t is a literal t, but \cI is a tab.