Please note that I am at an Internet café right now and thus cannot test anything that I am writing, so be gentle with me :)

The regex you listed is better, but you should be aware that if you're working with data that someone else supplies, you may have to deal with escaped quotes. The regex /"([^"]+)"/g will probably not behave as you expect with the following:

my $string = q!"This is \"data\""!;
So, we try the following:
$string =~ /"((?:\\"|[^"])*)"/;
Break that out:
$string =~ /
    "           # first quote
    (           # capture to $1
      (?:       # non-capturing parens
        \\"     # an escaped quote
        |       # or
        [^"]    # a non-quote
      )*        # end grouping (zero or more of above)
    )           # end capture
    "           # last quote
/x;
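To see the difference in practice, here is a minimal sketch that runs both patterns against the sample string. The variable names ($naive, $escaped) are my own and purely illustrative, and the commented output is what I would expect rather than something taken from the thread:

```perl
use strict;
use warnings;

# Sample data containing escaped quotes (q!! keeps these backslashes literal).
my $string = q!"This is \"data\""!;

# Naive pattern: stops at the first quote it sees, escaped or not.
my ($naive) = $string =~ /"([^"]+)"/;

# Escape-aware pattern: treats \" as part of the contents.
my ($escaped) = $string =~ /"((?:\\"|[^"])*)"/;

print "naive:   $naive\n";      # naive:   This is \
print "escaped: $escaped\n";    # escaped: This is \"data\"
```

The naive pattern gives up at the backslash before the first inner quote, while the second pattern carries on to the real closing quote.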
Looks good. We allow for escaped quotes, but what if the string is something like "test\"? That's poorly formed, so we'll probably also have to allow escaped escapes (sigh). That means handling a string like "test\\", where the backslash before the closing quote is itself escaped. The following should be pretty close to what you want:
$string =~ /"((?:\\["\\]|[^"])*)"/;
It's really ugly, but should be closer to what you might need. However, regular expressions such as these can get quite hairy. I understand that Text::Balanced is perfect for issues like this, but I have never used it.
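For comparison, here is a rough sketch that runs both hand-written patterns and Text::Balanced over a string whose first quoted chunk ends in an escaped backslash. The sample input and variable names are assumptions of mine. The extract_delimited call reflects my reading of the Text::Balanced documentation (it pulls a delimited string from the start of the text, uses backslash as its default escape character, and returns the string with its delimiters), so please check it against the docs before relying on it:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

# A single-quoted heredoc keeps every backslash literal.
chomp(my $text = <<'END');
"test\\" and "more"
END

# Pattern that knows only about escaped quotes: it reads the second
# backslash as escaping the closing quote and runs on too far.
my ($quotes_only) = $text =~ /"((?:\\"|[^"])*)"/;

# Pattern that also accepts escaped backslashes.
my ($with_escapes) = $text =~ /"((?:\\["\\]|[^"])*)"/;

# Text::Balanced: extract a "-delimited string from the start of $text.
# In list context the first two return values are the extracted text
# (quotes included) and whatever follows it.
my ($extracted, $remainder) = extract_delimited($text, '"');

print "quotes only:    $quotes_only\n";     # quotes only:    test\\" and
print "with escapes:   $with_escapes\n";    # with escapes:   test\\
print "Text::Balanced: $extracted\n";
```

One caveat on the hand-written patterns: because [^"] also matches a backslash, the regex engine can backtrack and reinterpret an escape on malformed input. Changing that class to [^"\\] would probably be stricter, though I have not tried it against real data.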

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.


In reply to (Ovid - check for escaped quotes) Re(3): (dot star) Re: Repeatable regex. by Ovid
in thread Repeatable regex. by the_0ne
