WHolcomb has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write a regular expression to help parse a configuration file. I want to let users of the file specify any character in their directives, including the characters that are special to the file format (comment (#), equivalence (=), etc.), and I have been trying to write an appropriate regex with no success. I am new at this and I have tried:
s/([^(?:([^\\]|\A)\\(\\{2})*\#)]*)(.*)/$1/
which is meant to represent a # not preceded by an odd number of \'s (since \\# is the \ character metaquoted, followed by a comment), but that didn't work because ^ inside [] only negates a set of single characters and not a sequence of characters. I then tried the Perl 5.005 negative lookbehind (?<!), but it only allows fixed-width lookbehinds and I want to allow any number of \'s. Currently I am doing:
@_ = split /\Q#\E/;
$_ = $_[0];
if(/\A\s*\Z/) { next; }
$string = $_;
for($i = 1; $i <= $#_; $i++) {
  $_ = $_[$i - 1];
  m/(.)((\\){2})*\Z/;
  if("$1" eq "\\") {
    $string .= "\#" . $_[$i];
  } else {
    last;
  }
}
Can anyone suggest a regex to do all that work? I remember seeing one to correctly parse a C string somewhere which would deal with these same issues, but hard as I look I cannot find it.

Will

Originally posted as a Categorized Question.

Re: How do I write a regex which allows meta-quoting?
by WHolcomb (Initiate) on Apr 13, 2000 at 19:39 UTC
    Quite nearly there. All that is left is that things in brackets, like array subscripts, are turned into links to other nodes. That ought to be fixable by replacing the brackets with their HTML character codes, which I didn't know off the top of my head. Ahh, they are &#91; for [ and &#93; for ].

    To the monks who maintain this monastery, I might suggest that the node linking ignore []'s inside <pre>'s.
    s/([^(?:([^\\]|\A)\\(\\{2})*\#)]*)(.*)/$1/
    
    (?<!)
    
    $c = "\#";
    $m = "\\";
    
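    while(<IN>) {
      chomp;
      @_ = split /\Q$c\E/;        # pieces between comment characters
      $_ = $_[0];
      next if(/\A\s*\Z/);         # skip blank and comment-only lines
      $string = $_;
      for($i = 1; $i <= $#_; $i++) {
       $_ = $_[$i - 1];
       # $1 is the character just before a trailing run of paired $m's
       m/(.)((\Q$m\E){2})*\Z/;
       if("$1" eq "$m") {         # odd number of $m's: this $c was escaped
         $string .= "$c" . $_[$i];
       } else {
         last;                    # unescaped $c: the rest is a comment
       }
      }
    }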
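
For comparison, here is one way the rule described in the question (a # starts a comment unless it is preceded by an odd number of \'s) might be written as a single substitution. It is only a sketch under a few assumptions: the line has already been chomped, \ escapes whatever single character follows it, and the backslashes are left in the result, as in the loop above. The names strip_comment and $bs are hypothetical and not from the original post. Instead of looking backwards from the #, it scans forward from the start of the line, so no variable-width lookbehind is needed.

```perl
use strict;
use warnings;

# strip_comment is a hypothetical helper name used only for this sketch.
sub strip_comment {
    my ($line) = @_;
    # From the start of the line, consume tokens that are either
    #   [^\\#]  one ordinary character (neither a backslash nor '#'), or
    #   \\.     a backslash together with the character it escapes.
    # A '#' can only be consumed as part of "\#", so the '#' that follows
    # the group must be unescaped; it and everything after it are dropped.
    $line =~ s/^((?:[^\\#]|\\.)*)#.*$/$1/s;
    return $line;
}

my $bs = "\\";    # a single backslash, kept in a variable so the samples stay readable
for my $line ("name = value # note",
              "name = a${bs}#b",
              "name = a${bs}${bs}# note") {
    print strip_comment($line), "\n";
}
```

The first sample loses its comment, the second is left alone because its # is escaped, and the third is cut after the two backslashes because an escaped backslash does not protect the # that follows it.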