Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

If I apply this regex: /"[^"\r\n]*"/ to this string:

Houston, we have a problem with "string one" and "string two". Please respond.

It will match (1) "string one" and (2) "string two".

OK, but why is there not a third match between these two:

" and "

Thanks in advance.

Code tags added by GrandFather

Re: Regex Question
by moritz (Cardinal) on Aug 28, 2008 at 20:12 UTC
    The regex seems to be /"[^"\r\n]*"/.

    If you apply that regex once, you get "string one". If you apply it again, you get the same match.

    If you apply it with the /g modifier, you get the same first match, and it will set pos to the position just after the closing quote of that match. When you match a second time, it will start to look for a match starting from pos.
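A minimal sketch of the /g behaviour described above, using the string from the question. The variable names ($text, @found) are illustrative only, and the capturing parentheses are added here just to record each match.

```perl
use strict;
use warnings;

my $text = 'Houston, we have a problem with "string one" and "string two". Please respond.';

my @found;
# In scalar context, each /g match resumes at pos($text),
# i.e. just after the end of the previous match.
while ($text =~ /("[^"\r\n]*")/g) {
    push @found, [ $1, pos($text) ];
}

printf "%-14s next search starts at offset %d\n", @$_ for @found;
```

Because the second attempt starts after the closing quote of "string one", that quote is no longer available to open a match, so " and " is never tried.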

    So this way you'll never get overlapping matches. If overlapping matches are what you want, you can use this regex:

    m/"(?=([^"\r\n]*"))/g

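    This puts every match, including the overlapping ones, into $1 one after another. Note that $1 holds only the text after the opening quote (for example string one"), because the opening quote itself is matched outside the lookahead.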
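A short sketch of how the lookahead regex above could be used to collect the overlapping matches. The names $text and @overlapping are illustrative, and putting the opening quote back in front of $1 is an assumption about the desired output.

```perl
use strict;
use warnings;

my $text = 'Houston, we have a problem with "string one" and "string two". Please respond.';

my @overlapping;
# Only the opening quote is consumed; the rest is matched inside a
# lookahead, so the closing quote can open the next match.
while ($text =~ /"(?=([^"\r\n]*"))/g) {
    push @overlapping, qq{"$1};   # prepend the quote that sits outside the capture
}

print "$_\n" for @overlapping;
```

With the string from the question this yields three results: "string one", " and " and "string two". The final quote produces nothing because no further quote follows it.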

Re: Regex Question
by didess (Sexton) on Aug 28, 2008 at 20:53 UTC
    Hello. If your RE is meant to match any text between a pair of double quotes (containing no double quotes, CR, or LF), you should write it as:
    "[^"\r\n]*"
    I'm not sure how useful excluding CR and LF is here. Anyhow, applied once (without the /g modifier), this RE will always match "string one", because that is the first match from left to right. It will never match "string two" or " and ". Good answer?