in reply to Re^5: Perlre interpretation required (perfect dequote regex)
in thread Perlre interpretation required

Sorry Aristotle, but this is far from "unbreakable". In fact, it doesn't even compile: you can't apply the /g modifier to qr//.
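For anyone following along, here is a minimal sketch of that point. It assumes a toy pattern (not the dequote regex from this thread): qr// takes flags such as /x or /i, while /g belongs on the match operator that uses the compiled pattern. The string eval is only there to trap the compile-time error.

```perl
use strict;
use warnings;

# Assumption: a toy pattern for illustration, not the dequote regex.
my $word = qr/(\w+)/;    # qr// accepts flags such as /x, /i, /s, /m

# /g goes on the match operator that uses the compiled pattern.
my @words = "foo bar baz" =~ /$word/g;
print "@words\n";

# Writing qr//g is rejected at compile time; string eval traps the error.
my $ok = eval q{ my $bad = qr/(\w+)/g; 1 };
print "qr//g rejected: $@" unless $ok;
```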

Your version

#! perl -slw
use strict;
require 5.008;

sub make_dequote_rx {
    my @char = map quotemeta, @_;
    my $chars = join '', @char;
    my $rx = join( '|',
        map(qq[$_ (?: \\ . | $_$_ | [^$_] ) + $_], @char),
        qq[(?: \\ . | [^$chars] ) +],
    );
    return qr/\G ( $rx )/x; ##You cannot apply /g to qr//
}

my $dequote = make_dequote_rx qw(' "); #"

while( <DATA> ){
    chomp;
    our $quoted;
    print;
    print "\t<$1>" while m[$dequote]g;
    print '';
}

__DATA__
unquoted stuff "quoted stuff" unquoted stuff
unquoted stuff 'quoted stuff' unquoted stuff
unquoted stuff "quoted stuff with embedded 'alternate' quotes" unquoted stuff
unquoted stuff 'quoted stuff with embedded "alternate" quotes' unquoted stuff
unquoted stuff "quoted stuff with embedded ""like"" quotes" unquoted stuff
unquoted stuff 'quoted stuff with embedded ''like'' quotes' unquoted stuff
unquoted stuff "quoted stuff with embedded """"like"""" quotes" unquoted stuff
unquoted stuff 'quoted stuff with embedded ''''like'''' quotes' unquoted stuff
unquoted stuff "quoted 'stuff' with escaped \"like\" quotes" unquoted stuff
unquoted stuff 'quoted 'stuff' with escaped \'like\' quotes' unquoted stuff
unquoted stuff "quoted stuff with embedded ""like"" quotes and escaped \"like\" quotes" unquoted stuff
unquoted stuff 'quoted stuff with embedded ''like'' quotes and escaped \'like\' quotes' unquoted stuff

Output

D:\Perl\test>junk
unquoted stuff "quoted stuff" unquoted stuff
	<unquoted stuff "quoted stuff>

unquoted stuff 'quoted stuff' unquoted stuff
	<unquoted stuff 'quoted stuff>

unquoted stuff "quoted stuff with embedded 'alternate' quotes" unquoted stuff
	<unquoted stuff "quoted stuff with embedded 'alternate>

unquoted stuff 'quoted stuff with embedded "alternate" quotes' unquoted stuff
	<unquoted stuff 'quoted stuff with embedded "alternate>

unquoted stuff "quoted stuff with embedded ""like"" quotes" unquoted stuff
	<unquoted stuff "quoted stuff with embedded ">
	<"like"" quotes">
	< unquoted stuff>

unquoted stuff 'quoted stuff with embedded ''like'' quotes' unquoted stuff
	<unquoted stuff 'quoted stuff with embedded '>
	<'like'' quotes'>
	< unquoted stuff>

unquoted stuff "quoted stuff with embedded """"like"""" quotes" unquoted stuff
	<unquoted stuff "quoted stuff with embedded ">
	<"""like"""" quotes">
	< unquoted stuff>

unquoted stuff 'quoted stuff with embedded ''''like'''' quotes' unquoted stuff
	<unquoted stuff 'quoted stuff with embedded '>
	<'''like'''' quotes'>
	< unquoted stuff>

unquoted stuff "quoted 'stuff' with escaped \"like\" quotes" unquoted stuff
	<unquoted stuff "quoted 'stuff>

unquoted stuff 'quoted 'stuff' with escaped \'like\' quotes' unquoted stuff
	<unquoted stuff 'quoted 'stuff>
	<' with escaped \'>
	<like\>
	<' quotes'>
	< unquoted stuff>

unquoted stuff "quoted stuff with embedded ""like"" quotes and escaped \"like\" quotes" unquoted stuff
	<unquoted stuff "quoted stuff with embedded ">
	<"like"" quotes and escaped \">
	<like\>
	<" quotes">
	< unquoted stuff>

unquoted stuff 'quoted stuff with embedded ''like'' quotes and escaped \'like\' quotes' unquoted stuff
	<unquoted stuff 'quoted stuff with embedded '>
	<'like'' quotes and escaped \'>
	<like\>
	<' quotes'>
	< unquoted stuff>

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

Re^7: Perlre interpretation required (remember kids: test all code..)
by Aristotle (Chancellor) on May 04, 2003 at 13:35 UTC
    Have you tried the first snippet? It is unbreakable. I didn't test the second though, sigh. I forgot that I'm using the doublequote-like op to put together a regex, so I need to double the backslash once for the regex and once more for the doublequotes, meaning I need four backslashes in the source to get one literal backslash in the end. Let me fix that.
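A small sketch of the double unescaping described here. The fragments and variable names are illustrative assumptions, not the corrected snippet itself. With two backslashes in a qq string the regex engine sees a backslash followed by a space, which under /x is an escaped literal space, so the fragment means "a space, then any character". With four, it sees an escaped backslash.

```perl
use strict;
use warnings;

# In a double-quoted string each pair of backslashes collapses to one.
my $two  = qq[\\ .];      # three characters: backslash, space, dot
my $four = qq[\\\\ .];    # four characters: backslash, backslash, space, dot

# Under /x an escaped space is a literal space, so $two means
# "a space, then any character"; $four means "a backslash, then any character".
my $rx_two  = qr/\A (?: $two  ) \z/x;
my $rx_four = qr/\A (?: $four ) \z/x;

my $sample = '\\"';       # two characters: a backslash and a double quote

print "lengths: ", length($two), " and ", length($four), "\n";
print "two backslashes match:  ", ( $sample =~ $rx_two  ? "yes" : "no" ), "\n";
print "four backslashes match: ", ( $sample =~ $rx_four ? "yes" : "no" ), "\n";
```

This is consistent with the output posted above, where the first match runs straight through a quote character that happens to follow a space.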

    Makeshifts last the longest.

      I guess I must be using it wrong, or the two routines aren't meant to achieve the same thing. As-is, yours does find and match the quoted sections and handles the embedded quotes OK -- although it does leave the leading and trailing quotes in place -- but it also matches the unquoted sections?

      I get the impression that we are both achieving our goals adequately; they are just different goals :)

      Testcode

      Results

        Ah, yes. Didn't look that closely. Mine matches all the pieces of a string that "belong together". The surrounding quotes are left intact because the inner escaped sequences are, too, so you need to know what kind of string you have in order to unescape the inner sequences. The resulting s/// cleanup regex looks very similar to the finder regex I'm using.
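A rough sketch of that two-step idea, under stated assumptions: the sub name dequote_pieces, the hand-written pattern, and the sample strings are all hypothetical and are not the code from this thread. The finder walks the line with \G and keeps each piece whole. The cleanup then uses the piece's own quote character to undo backslash escapes and doubled quotes, and its alternation mirrors the finder's.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into quoted and unquoted pieces,
# then strip the quotes and undo the escapes inside each quoted piece.
sub dequote_pieces {
    my ($line) = @_;
    my @pieces;
    while ( $line =~ m{
        \G (
              " (?: \\. | "" | [^"\\] )* "     # double-quoted piece
            | ' (?: \\. | '' | [^'\\] )* '     # single-quoted piece
            | (?: \\. | [^'"\\] )+             # unquoted run
        )
    }xgc ) {
        my $piece = $1;
        if ( $piece =~ /\A(["'])(.*)\1\z/s ) {
            my ( $q, $body ) = ( $1, $2 );
            # The cleanup mirrors the finder: escaped char, or doubled quote.
            $body =~ s{ \\(.) | \Q$q$q\E }{ defined $1 ? $1 : $q }xgse;
            $piece = $body;
        }
        push @pieces, $piece;
    }
    return @pieces;
}

my @double = dequote_pieces( q{say "a ""b"" \"c\"" now} );
my @single = dequote_pieces( q{it 'can''t' fail} );
print join( '|', @double ), "\n";
print join( '|', @single ), "\n";
```

Which quote character a piece was wrapped in decides what counts as a doubled quote, which is why the finder leaves the surrounding quotes in place for the cleanup step. The sketch simply stops at an unbalanced quote rather than reporting it.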