I thought, naively, that the easiest way of matching a string such as 'XOX' or 'TNT' but not 'XXX' or 'TTT' would be:

/(.)[^\1]\1/

...but this doesn't work:

$_ = 'BBB'; print /(.)[^\1]\1/ ? 'match' : 'no match';

This prints match. Indeed, it seems to match if I replace the middle 'B' with any single character (including '1' and '\').

Even more strangely, to me, if I 'un-negate' the character class:

/(.)[\1]\1/

...I don't seem to match anything (BTW, I get no complaints with strictures and warnings).
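
For anyone who wants to reproduce this, here is a minimal script that runs both patterns over a few three-character strings. The check() helper and the choice of test strings are just my own scaffolding for illustration, not anything taken from perlre:

```perl
use strict;
use warnings;

# Hypothetical helper: report whether $str matches the compiled pattern $re.
sub check {
    my ($re, $str) = @_;
    return $str =~ $re ? 'match' : 'no match';
}

my $negated   = qr/(.)[^\1]\1/;   # the pattern I expected to reject 'BBB'
my $unnegated = qr/(.)[\1]\1/;    # the 'un-negated' variant

# Middle character is, in turn: 'B', 'X', '1' and a literal backslash.
for my $str ('BBB', 'BXB', 'B1B', 'B\\B') {
    printf "%-4s negated: %-9s un-negated: %s\n",
        $str, check($negated, $str), check($unnegated, $str);
}
```

This should report match for the negated class and no match for the un-negated one on all four strings, consistent with what I describe above, and it runs without complaint under strictures and warnings.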

Context: I'm trying to code a simple substitution cipher solver (e.g. where 'ABCABC' will match 'murmur' and 'tsetse' but not 'booboo'). I'm aware that there are other ways of doing it, notably merlyn's post here. However, he uses (unless I'm mistaken) negative look-behind magic, and I haven't got that far in Mastering Regular Expressions yet :-). I'm not looking for a solution; I'm just curious as to what /[^\1]/ and /[\1]/ actually match. I've scoured perlre and done an index search for the (admittedly large :-) part of Mastering Regular Expressions that I still haven't read, but found nothing that appears relevant to my question (that's not to say that there is nothing relevant, just that I might not have understood its relevance...).
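
To make the context concrete, here is a rough non-regex sketch of the kind of matching I mean. It is not merlyn's approach, and signature() and same_pattern() are names I made up for illustration. It only compares the letter-repetition pattern of two words, which is one of the "other ways" rather than an answer to my actual question:

```perl
use strict;
use warnings;

# Hypothetical helper: replace each character by the order in which it
# first appears, so 'murmur' and 'ABCABC' both become '0,1,2,0,1,2'.
sub signature {
    my ($word) = @_;
    my (%seen, @sig);
    my $next = 0;
    for my $ch (split //, $word) {
        $seen{$ch} = $next++ unless exists $seen{$ch};
        push @sig, $seen{$ch};
    }
    return join ',', @sig;
}

# Hypothetical helper: true if both words share the same repetition pattern.
sub same_pattern {
    my ($pattern, $word) = @_;
    return signature($pattern) eq signature($word);
}

for my $word (qw(murmur tsetse booboo)) {
    printf "%s: %s\n", $word,
        same_pattern('ABCABC', $word) ? 'match' : 'no match';
}
```

With 'ABCABC' as the pattern this should accept 'murmur' and 'tsetse' and reject 'booboo', since 'booboo' reuses a letter where the pattern demands a new one.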

TIA for your enlightenment.

dave


In reply to Pattern matching: Why no \1 in [ ]? by Not_a_Number
