update: No -- Crackers2 is correct. Sorry. First time I've wanted to downvote my own post.

Well, it seems your character class isn't working the way you expect. I usually find the print statement to be an excellent debugger. I modified your code a bit -- first, I didn't see a particular need for the \A in a sample this short. I'm also used to looking at regexes without whitespace, and I'm not sure why you used both the s and m modifiers (aren't they contradictory?) {update: never mind that last -- I found this in the docs: "both s and m modifiers (//sm): Treat string as a single long line, but detect multiple lines. '.' matches any character, even "\n". ^ and $, however, are able to match at the start or end of any line within the string."}
$foo = qq{^snafu^|^foobar^\n};
$foo =~ m/(\W)([^\1]+)\1(\W)/;
$text_qual = $1;
$field_sep = $3;
print "text: $text_qual\n";
print "field: $field_sep\n";
print $2;
print "\n2nd try\n";
$foo2 = qq{^snafu1|^foobar^\n};
$foo2 =~ m/(\W)([^\1]+)\1(\W)/;
$text_qual = $1;
$field_sep = $3;
print "text: $text_qual\n";
print "field: $field_sep\n";
print $2;
yields
H:\script>perl majingz.pl
text: ^
field: 

snafu^|^foobar
2nd try
text: ^
field: 

snafu1|^foobar
That tells me your [^\1]+ class sucked up everything from the s to the r, and then the \1 kicked in for the fourth ^. So your backreference isn't working inside the character class. This isn't so surprising (to me, anyway), since a character class doesn't follow many of the standard regex rules (a period inside a character class, for example, is just a period, escaped or not). I don't see any hard documentation on the failure of backreferences in character classes, but it makes sense to me.
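To make that concrete, here is a minimal sketch comparing the class against one common workaround, a negative lookahead that checks each character against the capture. The variable names are mine, and the lookahead form is a suggestion, not something from the original code. It relies on the fact that inside a bracketed class Perl reads \1 as an octal escape for chr(1), not as a backreference.

```perl
use strict;
use warnings;

my $foo = qq{^snafu^|^foobar^\n};

# Inside [...], \1 is the single character chr(1), so [^\1]+ happily
# runs across the ^ delimiters and only backtracks as far as it must.
my ($qual_a, $body_a, $sep_a) = $foo =~ m/(\W)([^\1]+)\1(\W)/;

# Workaround sketch: consume one character at a time, but only while
# the text at that position is NOT whatever group 1 captured.
my ($qual_b, $body_b, $sep_b) = $foo =~ m/(\W)((?:(?!\1).)+)\1(\W)/s;

# Direct check of what the class excludes: chr(1) is rejected, ^ is not.
my $class_rejects_chr1  = ("\x01" =~ /^[^\1]$/) ? 0 : 1;
my $class_accepts_caret = ("^"    =~ /^[^\1]$/) ? 1 : 0;

print "class body:     [$body_a]\n";
print "lookahead body: [$body_b]\n";
print "lookahead sep:  [$sep_b]\n";
```

With the lookahead version the quoted text stops at the first closing ^, and the character after it becomes the field separator.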

What is somewhat surprising to me is that in the second try (where I assumed the "1" would be part of the character class), the "1" doesn't trigger the class. I'm guessing this is because the "\" is the escape character, so "\1" is read as a single escape sequence and not as a backslash followed by a literal 1. I do note that if I double the backslash (i.e., [^\\1]), I get the expected result of the class matching on the "1".
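A small sketch of the difference between the two classes, using single-character tests so the regex engine's backtracking doesn't muddy the picture. The variable names are mine; the second sample string is the one from the code above.

```perl
use strict;
use warnings;

# [^\1] excludes only chr(1), so a literal digit 1 is still allowed.
my $single_allows_digit = ('1' =~ /^[^\1]$/) ? 1 : 0;

# [^\\1] excludes two characters: a literal backslash and the digit 1.
my $double_blocks_digit     = ('1'  =~ /^[^\\1]$/) ? 0 : 1;
my $double_blocks_backslash = ('\\' =~ /^[^\\1]$/) ? 0 : 1;

# With the doubled backslash, the class can no longer run across the "1"
# in the second sample, so the match has to start later in the string.
my $foo2 = qq{^snafu1|^foobar^\n};
my ($qual2, $body2, $sep2) = $foo2 =~ m/(\W)([^\\1]+)\1(\W)/;

print "single allows 1:  $single_allows_digit\n";
print "double blocks 1:  $double_blocks_digit\n";
print "double blocks \\:  $double_blocks_backslash\n";
print "body with [^\\\\1]: [$body2]\n";
```

Note that [^\\1] still isn't a backreference: it just happens to exclude the digit, which is why it behaves differently on this particular sample.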

I wish I could tell you how to resolve your situation, but I think it's a difficult one: parsing CSVs is not an easy task. That's one reason there's a module.

In reply to Re: Regular Expresssion TroubleShoot Help plz by SamCG
in thread Regular Expresssion TroubleShoot Help plz by MajingaZ
