Okay, the title is kind of a joke. It's just a good-natured tweak at merlyn for the brouhaha over his WARNING t0mas wrote BAD CODE node that generated so much flak. No offense intended :)
merlyn's code was bugging me, but I couldn't quite put my finger on why. My problem was that the dot metacharacter is so indiscriminate that it will match anything, embedded quotes included. However, I simply assumed that if merlyn posted the code, it must work. His code is great if you're checking for C-style comments, where the text begins and ends with fixed delimiters, something like /* comment here */ or "? comment here ?". But if you read my post, that's not what we were checking for:
What happens if you were trying to extract questions in quotes without the trailing question mark?
I mentioned embedded question marks (my idea was that we might have more than one question in a quote), but I never mentioned embedded quotes. I just wanted one set of quotes and my original post bears that out. Here's merlyn's code and my correction:
#!/usr/bin/perl -w
$myvar = q{ abc"def"g"hi?"jkl };

# This regex is from merlyn
print "matched <$1>\n" if
    $myvar =~ /"            # First quote
               (            # Capture text to $1
                 (?:        # Non-backreferencing parentheses
                   (?!\?")  # not question quote?
                   .        # ok to inch along
                 )*         # Zero or more
               )            # End capture
               \?"/sx;      # Followed by a question mark and quote

# This regex is from Ovid
print "matched <$1>\n" if
    $myvar =~ /"            # First quote
               (            # Capture text to $1
                 (?:        # Non-backreferencing parentheses
                   [^?"]    # Not a question mark or quote
                   |        # or
                   \?(?!")  # A question mark not followed by a quote
                 )*         # Zero or more
               )            # End capture
               \?"/sx;      # Followed by a question mark and quote
The first regex will print matched <def"g"hi>. The second will print matched <hi>.
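If you want to poke at the difference yourself, here is a minimal sketch that runs both patterns against a few strings. The patterns are the two from above; the names $dot_based, $class_based, and extract(), along with the second and third sample strings, are my own inventions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The two patterns from the post, compiled once.
# Variable and sub names here are hypothetical, chosen for this sketch.
my $dot_based   = qr/" ( (?: (?!\?") . )* ) \?"/sx;         # merlyn's
my $class_based = qr/" ( (?: [^?"] | \?(?!") )* ) \?"/sx;   # Ovid's

# Return the captured text, or undef if the pattern does not match.
sub extract {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

my @samples = (
    q{ abc"def"g"hi?"jkl },      # the string from the post
    q{ say "what? why?" now },   # embedded question mark, one set of quotes
    q{ "no question here" },     # quotes, but no trailing question mark
);

for my $text (@samples) {
    for my $pair ( [ dot => $dot_based ], [ class => $class_based ] ) {
        my ($name, $re) = @$pair;
        my $got = extract($re, $text);
        printf "%-5s %-24s => %s\n",
            $name, $text, defined $got ? "<$got>" : 'no match';
    }
}
```

The two patterns should only disagree on the first sample, where the dot is free to walk across the embedded quotes. With a single set of quotes they capture the same text, embedded question marks and all.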
No disrespect is intended towards Randal as he was right in pointing out that my first regex was broken.
Cheers,
Ovid