FWIW and just as a matter of interest, the reason the regex in your OP
my @strings = $data =~ /\"[^\"]+\"/g;
was "... extracting almost every line..." may be that it does not handle an empty (i.e., zero-length) string properly: the [^\"]+ sub-expression requires at least one non-double-quote character. If there is any "" empty string in the text, matching gets "out of sync": the regex cannot match at the opening quote of the empty string, so it takes the closing quote of the empty string as the opening quote of a spurious string, whose body then runs up to the next double-quote.
    use warnings;
    use strict;

    use Data::Dump qw(dd);

    my $data = do { local $/; <DATA> };

    my @strings = $data =~ /\"[^\"]+\"/g;
    dd \@strings;

    __DATA__
    nothing "hello" foo "bar" quz "hello2" "world" foo2 "bar2" quz2 "baz" blah blah2 "" blah3
    many
    lines
    of
    unquoted stuff
    "example 1 for instance"

Output:

    c:\@Work\Perl\monks\kepler>perl extract_double_quote_bodies_2.pl
    [
      "\"hello\"",
      "\"bar\"",
      "\"hello2\"",
      "\"world\"",
      "\"bar2\"",
      "\"baz\"",
      "\" blah3\nmany\nlines\nof\nunquoted stuff\n\"",
    ]

Note that [^"] "not a double-quote" includes the newline character.
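For comparison, here is a minimal sketch of the same kind of extraction with * in place of +, so that an empty "" can match as a string in its own right. The shortened $data is made up for illustration (it is not the data from the script in this post), and @plus and @star are hypothetical names.

```perl
use strict;
use warnings;

# Made-up sample: one normal string, one empty string, some unquoted
# lines, then another normal string.
my $data = qq{nothing "hello" blah2 "" blah3\nmany\nlines\n"example 1 for instance"\n};

# + needs at least one character between the quotes, so "" cannot match
# and the closing quote of "" becomes the start of a spurious match.
my @plus = $data =~ /"[^"]+"/g;

# * allows zero characters, so "" matches and everything stays in sync.
my @star = $data =~ /"[^"]*"/g;

print scalar(@plus), " match(es) with +\n";
print scalar(@star), " match(es) with *\n";
print "  $_\n" for @star;
```

With the + version, the last string is lost and a multi-line chunk of unquoted text is returned instead; with the * version, the three quoted strings come back as written. [^"] still matches a newline in both cases.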
Update: Also note that /"[^"]+"/g and /"[^"]*"/g will not properly handle a double-quoted string containing an escaped double-quote (e.g., "x\"y") and will end up "out of sync" in the same way as /"[^"]+"/g with an empty string.
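One common way to cope with escaped double-quotes is to let the body be either an ordinary character or a backslash followed by any character. The sketch below assumes backslash is the only escape mechanism in the text; $text, $dq, @naive and @strings are hypothetical names and the sample string is made up.

```perl
use strict;
use warnings;

# Made-up sample: a string with an escaped quote, an empty string,
# and a plain string. q{} keeps the backslash literally.
my $text = q{a "x\"y" b "" c "plain" d};

# Body is any run of: a character that is neither " nor \,
# or a backslash followed by any single character (/s lets . match newline).
my $dq = qr/"(?:[^"\\]|\\.)*"/s;

my @naive   = $text =~ /"[^"]*"/g;   # stops at the escaped quote
my @strings = $text =~ /$dq/g;       # treats \" as part of the body

print "naive:   $_\n" for @naive;
print "escaped: $_\n" for @strings;
```

The naive pattern ends its first match at the escaped quote and is out of sync from there on, while the second pattern returns the three strings intact. The group is non-capturing, so the list-context match with /g still returns whole matches, quotes included.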
Give a man a fish: <%-{-{-{-<
In reply to Re: Extract pattern match from file
by AnomalousMonk
in thread Extract pattern match from file
by kepler