in reply to (bbfu) (dot star) Re: Repeatable regex.
in thread Repeatable regex.

Ok, I changed it to this...

$myString = join('~', ( $myString =~ /"([^"]+)"/g ) );

Sorry, didn't know about the "Death to Dot Star", but the above modification does work.

Thanks.

(Ovid - check for escaped quotes) Re(3): (dot star) Re: Repeatable regex.
by Ovid (Cardinal) on Apr 03, 2001 at 12:19 UTC
    Please note that I am at an Internet café right now and thus cannot test anything that I am writing, so be gentle with me :)

    The regex you listed is better, but you should be aware that if you're working with data that someone else supplies, you may have to deal with escaped quotes. The regex /"([^"]+)"/g will probably not behave as you expect with the following:

    my $string = q!"This is \"data\""!;
    So, we try the following:
    $string =~ /"((?:\\"|[^"])*)"/;
    Break that out:
    $string =~ /
        "           # first quote
        (           # capture to $1
          (?:       # non-capturing parens
            \\"     # an escaped quote
            |       # or
            [^"]    # a non-quote
          )*        # end grouping (zero or more of above)
        )           # end capture
        "           # last quote
    /x;
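    As a rough sketch of the difference between the two patterns, the snippet below runs both against the sample string. The variable names $simple and $aware are illustrative only and do not come from the thread.

```perl
use strict;
use warnings;

# Sample input from the post: a quoted field containing escaped quotes.
# q!...! keeps the backslashes, so the string is: "This is \"data\""
my $string = q!"This is \"data\""!;

# Simple pattern: stops at the first quote it sees, escaped or not.
my ($simple) = $string =~ /"([^"]+)"/;

# Escape-aware pattern: a backslash-quote pair is kept inside the field.
my ($aware) = $string =~ /"((?:\\"|[^"])*)"/;

print "simple: $simple\n";
print "aware:  $aware\n";
```

    The simple pattern should capture only `This is \`, cutting the field off at the first escaped quote, while the escape-aware pattern should capture `This is \"data\"` in full.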
    Looks good. We allow for escaped quotes, but what if the string is something like "test\"? That's poorly formed, because the backslash escapes the closing quote, so we'll probably also have to allow escaped escapes (sigh). That means handling a string like "test\\". The following should be pretty close to what you want:
    $string =~ /"((?:\\["\\]|[^"])*)"/;
    It's really ugly, but should be closer to what you might need. However, regular expressions such as these can get quite hairy. I understand that Text::Balanced is perfect for issues like this, but I have never used it.
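    A minimal sketch of why the escaped-escape case matters, and of what the Text::Balanced route might look like. The two-field input and the variable names are assumptions made for illustration. extract_delimited is a function exported by Text::Balanced which, by default, treats backslash as the escape character and returns the match with its delimiters included.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

# Illustrative input: the first field ends in an escaped backslash.
# Inside q!...!, \\\\ yields two backslashes, so the string is: "test\\" "next"
my $string = q!"test\\\\" "next"!;

# Earlier pattern: pairs the second backslash with the closing quote
# and runs on into the next field.
my ($overrun) = $string =~ /"((?:\\"|[^"])*)"/;

# Pattern allowing escaped escapes: consumes the two backslashes as a pair,
# so the closing quote is recognised.
my @fields = $string =~ /"((?:\\["\\]|[^"])*)"/g;

# Text::Balanced alternative, run on a copy of the input.
my $copy = $string;
my ($extracted, $remainder) = extract_delimited($copy, '"');

print "overrun:   $overrun\n";
print "fields:    ", join('~', @fields), "\n";
print "extracted: $extracted\n";
```

    With this input, the earlier pattern should capture `test\\" ` (closing quote and trailing space included), while the revised pattern should yield the two fields `test\\` and `next`. extract_delimited should return `"test\\"` with the quotes still attached, leaving ` "next"` as the remainder.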

    Cheers,
    Ovid

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.