in reply to Using negative lookahead

Another take on it:

Unfortunately regexes don't support backreferences in character classes - like [^\g1] - to forbid the delimiter inside the string. (at least I couldn't find it.)
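For the curious, here is a minimal sketch of what happens if you try it anyway, using the older \1 spelling. As far as I know, inside a bracketed class Perl does not treat \1 as a backreference but as the octal escape for chr(1), so the class excludes only that control character. The helper name naive_match is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: tries to forbid the delimiter with [^\1].
# Inside [...] the \1 is not a backreference; it is read as the
# character chr(1), so the delimiter itself is still allowed.
sub naive_match {
    my ($str) = @_;
    return $str =~ /^(.)[^\1]*\1$/ ? 1 : 0;
}

print naive_match('abaca'), "\n";             # the embedded 'a' slips through
print naive_match('a' . chr(1) . 'a'), "\n";  # only chr(1) is excluded
```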

But it's possible to get the same effect with a negative lookahead:

  DB<34> $_='abbbbbbbbbba'
  DB<35> x m/^(.) ( (?!\1) . )* \1$/x
0  'a'
1  'b'

NB: a lookahead is zero-width, i.e. it doesn't move the match position. That's why the position has to be advanced explicitly with a . after each check.
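A small sketch to make the zero-width point concrete. The helper name matched_text is hypothetical; it only returns whatever the whole pattern matched.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text matched by the whole pattern,
# or undef if the pattern does not match.
sub matched_text {
    my ($str, $re) = @_;
    return $str =~ $re ? $& : undef;
}

# Lookahead alone: it inspects the next character but consumes nothing,
# so the overall match covers only the 'a'.
print matched_text('ab', qr/a(?=b)/), "\n";

# Negative lookahead plus '.': each repetition first checks "the delimiter
# does not start here", then the '.' consumes exactly one character.
print matched_text('abbba', qr/^(.)(?:(?!\1).)*\1$/), "\n";
```

Without the dot, the repeated group would never advance, so the closing \1 could only match directly after the opening delimiter.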

And this approach seems to work in your code:

#!/usr/bin/perl
use strict;
use warnings;

my @cases = (
    q{'abc"def'},
    q{'abc'},
    q{"abc"},
    q{''},
    q{'abc'def'},                 # Want this to fail matching
    q{'This shouldn't match'},    # Want this to fail matching
    q{"This isn't a problem"},
    q{"abc},
    q{abc"},
    q{abc},
    q{'abc"},
    q{'ab''},                     # Want this to fail matching
);

strip_quotes($_) for @cases;

# If we can remove a matching pair of single or double quotes from
# a string, without the quote symbol also appearing within the string,
# do so. Otherwise don't change the string.
sub strip_quotes {
    my $line = shift;
    print "\n$line\n";

    # NO NEGATIVE LOOKAHEAD
    # This works except it allows an embedded delimiter
    if (
        $line =~ m{^        # anchor
            (               # capture delimiter in pos 1
              ["']          # delim is single or double quote
            )
            (.*)            # anything
            \g1$}x          # finally, the delim
      )
    {
        print " 1- Got a match: delimiter was {$1}, body was {$2}\n";
    }
    else {
        print " 1- No match.\n";
    }

    # ATTEMPTING NEGATIVE LOOKAHEAD
    # This should fail if the delimiter is found in non-terminal pos.
    if (
        $line =~ m{^        # anchor start
            (               # capture delimiter in pos 1
              ["']          # delim is single or double quote
            )
            (
              (?:           # --- negate backreference
                (?!\g1)     # following letter is not delim
                .           # consume following letter
              )*
            )
            \g1             # finally, the delim
            $               # anchor end
        }x
      )
    {
        print " 2- Got a match: delimiter was {$1}, body was {$2}\n";
    }
    else {
        print " 2- No match.\n";
    }
}

'abc"def'
 1- Got a match: delimiter was {'}, body was {abc"def}
 2- Got a match: delimiter was {'}, body was {abc"def}

'abc'
 1- Got a match: delimiter was {'}, body was {abc}
 2- Got a match: delimiter was {'}, body was {abc}

"abc"
 1- Got a match: delimiter was {"}, body was {abc}
 2- Got a match: delimiter was {"}, body was {abc}

''
 1- Got a match: delimiter was {'}, body was {}
 2- Got a match: delimiter was {'}, body was {}

'abc'def'
 1- Got a match: delimiter was {'}, body was {abc'def}
 2- No match.

'This shouldn't match'
 1- Got a match: delimiter was {'}, body was {This shouldn't match}
 2- No match.

"This isn't a problem"
 1- Got a match: delimiter was {"}, body was {This isn't a problem}
 2- Got a match: delimiter was {"}, body was {This isn't a problem}

"abc
 1- No match.
 2- No match.

abc"
 1- No match.
 2- No match.

abc
 1- No match.
 2- No match.

'abc"
 1- No match.
 2- No match.

'ab''
 1- Got a match: delimiter was {'}, body was {ab'}
 2- No match.
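The test script only reports matches. If you want the stripping itself, a minimal sketch could look like the following. The function name stripped is hypothetical, and the use of \A, \z and the /s flag instead of ^ and $ is my own choice, not something taken from the script above.

```perl
use strict;
use warnings;

# Sketch: return the string without its surrounding quotes if it is
# wrapped in a matching pair of ' or " and that same quote character
# does not appear inside; otherwise return the string unchanged.
sub stripped {
    my ($line) = @_;
    if ( $line =~ m{\A (["']) ( (?: (?!\g1) . )* ) \g1 \z}xs ) {
        return $2;    # body between the two delimiters
    }
    return $line;     # no clean pair of quotes: leave as is
}

for my $case ( q{'abc"def'}, q{"This isn't a problem"}, q{'abc'def'}, q{'abc"} ) {
    printf "%-24s => %s\n", $case, stripped($case);
}
```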

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!

Re^2: Using negative lookahead
by ibm1620 (Hermit) on Oct 19, 2017 at 23:17 UTC
    Actually this looks simpler than the earlier solution. And I get the bit about moving the position with '.'.
Re^2: Using negative lookahead
by ibm1620 (Hermit) on Oct 20, 2017 at 13:51 UTC
    "Unfortunately regexes don't support backreferences in character classes - like [^\g1] - to forbid the delimiter inside the string. (at least I couldn't find it.)"

    I wouldn't think that would work in the general case, where \g1 refers to more than one character, would it?

      Depends on what you mean by the general case.

      Do you mean ...

      • ... [^\g1] with \g1 holding more than one character?
      This is hypothetical, since it doesn't work even for a single character.
      • ... ( (?! \g1) .)*
      This would disallow a multi-character sequence if the group captured a word.

      E.g. if \g1 captured the word "not", then "not" is forbidden from starting at any position of the repeated part.

      • ... ( (?! \g1) (?! \g2) .)*
      Here the chained lookaheads work like AND conditions.

      For single characters, this would be the equivalent of [^\g1\g2] (if that were possible).
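A short sketch of the last two cases. Both function names and both patterns are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Multi-character "delimiter": group 1 captures a whole word, and the
# lookahead forbids that word from starting anywhere in the body.
sub word_delimited {
    my ($str) = @_;
    return $str =~ /^(\w+)-((?:(?!\g1).)*)-\g1$/ ? $2 : undef;
}

# Chained lookaheads act as AND: a character is consumed only if neither
# the text of group 1 nor the text of group 2 starts at that position.
sub two_forbidden {
    my ($str) = @_;
    return $str =~ /^(.)(.)((?:(?!\g1)(?!\g2).)*)\g2\g1$/ ? $3 : undef;
}

print word_delimited('not-abc-not')     // 'no match', "\n";
print word_delimited('not-a not b-not') // 'no match', "\n";
print two_forbidden('ab123ba')          // 'no match', "\n";
print two_forbidden('ab1a3ba')          // 'no match', "\n";
```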

      You might be interested in this excellent tutorial:

      Using Look-ahead and Look-behind

      Cheers Rolf