So how do you include a value that contains a space and ends with a backslash?

You should change \\\2 to \\. and then decide which of three treatments you want:

  1. \x always becomes x
  2. \x stays \x except that \" becomes " and \\ becomes \
  3. \x stays \x except that \" becomes " and \\" becomes \" and \\\" becomes \\" etc.
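To make the difference between the three treatments concrete, here is a minimal sketch of each as a post-processing step. The sub names are made up for illustration, and it assumes the quote character is " only.

```perl
use strict;
use warnings;

# Treatment 1: a backslash followed by any character yields that character.
sub unescape_any {
    my ($s) = @_;
    $s =~ s/\\(.)/$1/gs;
    return $s;
}

# Treatment 2: only \" and \\ collapse; every other \x pair is left alone.
sub unescape_quote_and_backslash {
    my ($s) = @_;
    $s =~ s/\\(["\\])/$1/g;
    return $s;
}

# Treatment 3: a run of backslashes directly before a quote loses one
# backslash; backslashes anywhere else are untouched.
sub unescape_before_quote {
    my ($s) = @_;
    $s =~ s/(\\*)\\"/$1"/g;
    return $s;
}

# The raw text is:  a\tb\"c\\d
my $raw = 'a\tb\"c\\\\d';

print unescape_any($raw), "\n";                  # atb"c\d
print unescape_quote_and_backslash($raw), "\n";  # a\tb"c\d
print unescape_before_quote($raw), "\n";         # a\tb"c\\d
```

Note that only the first two treatments ever turn \\ into a single backslash; under the third, a backslash that is not in front of a quote is just data.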

But I find a much better method is to not use \ for escaping embedded quote characters if that is the only character you want to escape. Instead, use two adjacent quote characters to represent one embedded quote character.

That is, change \\\2 to \2\2 and then post-process the match to undouble the embedded quote characters.
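A rough sketch of the doubled-quote approach follows. It is not the original poster's code: parse_pairs and the key=value layout are assumptions made for the example, and (?!\2). stands in for "any character other than the opening quote".

```perl
use strict;
use warnings;

# Hypothetical helper: split key=value pairs where a value is either a bare
# word or a quoted string whose embedded quotes are written doubled.
sub parse_pairs {
    my ($text) = @_;
    my %pairs;
    while ($text =~ /\G\s*(\w+)=(?:(["'`])((?:\2\2|(?!\2).)*)\2|(\S*))/gcs) {
        my ($key, $quote, $quoted, $bare) = ($1, $2, $3, $4);
        if (defined $quote) {
            # Post-process the match: undouble the embedded quote characters.
            $quoted =~ s/\Q$quote$quote\E/$quote/g;
            $pairs{$key} = $quoted;
        }
        else {
            $pairs{$key} = $bare;
        }
    }
    return \%pairs;
}

# The input text is:  one="two=""three"" x" two=abc three="a b\"
my $pairs = parse_pairs(q{one="two=""three"" x" two=abc three="a b\\"});
print "$_ => $pairs->{$_}\n" for sort keys %$pairs;
```

Since the backslash has no special meaning here, a value that contains a space and ends with a backslash needs no special handling: three comes back as a b\ with the backslash intact.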

One problem with this approach is if you end up nesting lots of these constructs you'll end up with:

    q{one="two=""three=""""a b""""""" two=abc}

but that isn't much worse than the alternative of

    q{one="two=\"three=\\\"a b\\\"\"" two=abc}

and allowing multiple quote characters (like you have) is the real solution to such problems

    q{one="two='three=`a b`'" two=abc}

and avoiding a single escape character is why I prefer my approach.

Update: I wouldn't use a non-greedy match. I'd also be more strict so the regex engine doesn't have any option about matching things other than the way I want it to. So in your original code [^\2] should be [^\\\2] (though I recall [^\2] not working when I tested it so perhaps this means that your code won't work on older versions of Perl).

You don't want the regex engine to decide to look at 'I\'m' and match \ against [^\2] and then have the middle ' terminate the string too early. Right now this probably won't happen due to subtle rules (I assume, based on your testing -- the rules are subtle enough that I'd have guessed that the regex would go the other route) but this leeway means that the regex can backtrack when a closing quote is missing and match a different quote in the manner I describe. You don't want to allow this.

You should also allow empty strings (so change +? to just *). And I'd use [^\2]+ in hopes of being more efficient, but such concerns should be considered last.
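Here is a small sketch of the backtracking hazard and a stricter pattern that avoids it. Neither regex is the original poster's code; $loose is only a stand-in for a pattern of the kind described. As far as I know, \2 inside a bracketed class is read as an octal escape rather than as a backreference, so the strict version uses the lookahead (?!\2|\\). in place of [^\\\2].

```perl
use strict;
use warnings;

# Loose: \' may be read as an escaped quote, but a lone backslash may also be
# consumed as an ordinary character, so the engine has two ways to match it.
my $loose  = qr/(["'`])((?:\\\2|(?!\2).)*?)\2/s;

# Strict: a backslash is only ever consumed together with the character after
# it; ordinary characters are anything except the quote or a backslash.
# Using * (not +?) also allows empty strings.
my $strict = qr/(["'`])((?:\\.|(?!\2|\\).)*)\2/s;

# Hypothetical helper: return the contents of the first quoted string, or undef.
sub first_quoted {
    my ($re, $text) = @_;
    return $text =~ $re ? $2 : undef;
}

my $unterminated = q{'I\'m};    # the closing quote is missing

my $got = first_quoted($loose, $unterminated);
print "loose:  ", (defined $got ? $got : 'no match'), "\n";   # loose:  I\

$got = first_quoted($strict, $unterminated);
print "strict: ", (defined $got ? $got : 'no match'), "\n";   # strict: no match
```

With the closing quote missing, the loose pattern backtracks and lets the middle ' end the string, while the strict pattern simply fails to match.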

Update2: I notice you use \t in your values but I don't see you dealing with that anywhere. Is that supposed to stay \t or become a tab? Or is that just to test that other backslashes don't get eaten? For that matter, I don't see where you turn \' into ' so...

And no need to backslash the quotes in a character class so you can use ["'`] instead (though it doesn't hurt either).

You might want to look at Regexp::Common to compare how it does some of these things. Unfortunately, reading the code of that module is rather difficult. Luckily, you can just print out the regexes it gives back to you instead. (:

                - tye

In reply to Re: Regex capturing either quoted strings or bare words (final backslash) by tye
in thread Regex capturing either quoted strings or bare words by gmax
