in reply to Perl is returning... odd results... from regular expressions. Things matching when they shouldn't, and stuff like that.

First off, I agree with the other posts!

As I read it, you are looking for three patterns. The first has a quotation mark, then any characters up to another quotation mark, an = sign, and finally a comma.

The second is the same as the first without the quotation marks.

The third is any characters, an = sign, and the newline character. If that is right, try the rewrite and see if you get better results, then tweak it from there.

The * and the ? next to each other are redundant, especially after the wildcard . (which means any character) and the *, which means 0 or more of them.

split /(\".*?\"(?=,))|(.*?(?=,))|(.*?(?=\n))/

I think you are looking for something more like this. The second pattern and the first end up being redundant, so I removed the first pattern. Please note that it has been a long time since I have worked on this type of pattern matching, and I may have completely missed the mark.

@some_values = split(/.*\=,|.*\=\n$/, $some_scalar);

Re^2: Perl is returning... odd results... from regular expressions. Things matching when they shouldn't, and stuff like that.
by Groxx (Novice) on Jan 11, 2007 at 08:07 UTC
    Actually, the "?" is useful. It makes the match non-greedy. Since this CSV file can have multiple quoted strings on a line, leaving it out makes the pattern return EVERYTHING between the first and last quote mark.
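
To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The sample line and variable names are made up for illustration and are not taken from the original CSV file.

```perl
use strict;
use warnings;

# Hypothetical sample line with two quoted fields (not from the real file).
my $line = '"alpha","beta",gamma';

# Greedy: .* grabs as much as it can, so the match runs to the LAST quote.
my ($greedy) = $line =~ /(".*")/;

# Non-greedy: .*? grabs as little as it can, so the match stops at the
# first closing quote.
my ($lazy) = $line =~ /(".*?")/;

print "greedy: $greedy\n";   # greedy: "alpha","beta"
print "lazy:   $lazy\n";     # lazy:   "alpha"
```

With only one quoted string on the line the two patterns match the same text, which is probably why the "?" can look redundant at first glance.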

    As to the other question marks, like the (?=,) portion, those are lookaheads.
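
A lookahead checks that something follows without making it part of the match. A small sketch of that, again using a made-up sample line rather than the real data:

```perl
use strict;
use warnings;

# Hypothetical sample line (not from the real file).
my $line = '"alpha","beta",gamma';

# (?=,) requires a comma right after the closing quote,
# but the comma itself is not included in the match.
my ($with_lookahead) = $line =~ /(".*?"(?=,))/;

# Without the lookahead, a literal comma in the pattern is consumed
# and ends up in the matched text.
my ($with_comma) = $line =~ /(".*?",)/;

print "lookahead: $with_lookahead\n";   # lookahead: "alpha"
print "consumed:  $with_comma\n";       # consumed:  "alpha",
```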

    I appreciate the reply, though! Thanks!