in reply to Re: \1, \2, \3, ... inside of a character class
in thread \1, \2, \3, ... inside of a character class
That weird regex (which I don't like) was written by someone who wanted to strip comments from HTML. He knew about HTML::Parser, but he wanted to do it with regexps for fun. I was trying to find valid HTML on which that code fails, and I noticed that he used \8 as a backreference inside a character class. I knew that one can interpolate a variable into a class, as in [$var], but [^\8] looked alarming to me. That is how I ran into this probably undocumented behavior.

```perl
sub eraseCommet {
    my ($all, $comment) = @_;
    return $all if !$comment;   # not a comment: keep the matched text as is
    return '';                  # comment: replace it with nothing
}

s/(<(\/)?((!--)|(script)|(style)|\w+)(?(4).*?-->|(\s+\w+(?:\s*=\s*(["'])?(?(8)[^\8]+?\8|\S+?(?=[>\s])))?)*?\s*\/?>(?(5).*?<\/script>|(?(6).*?<\/style>))))/eraseCommet($1,$4)/gixse;
```
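For comparison, here is a minimal sketch of what the `[^\8]+?\8` fragment seems to be aiming at: "everything up to the same quote character that opened the value". As far as I know, a backslash followed by a digit inside a bracketed class is not treated as a backreference, so this sketch uses a negative lookahead with the backreference outside any class instead. The sample string and the names `$html` and `@values` are made up for illustration; this is not the original author's code.

```perl
use strict;
use warnings;

# Hypothetical sample input: one value in double quotes containing a
# single quote, one value in single quotes containing double quotes.
my $html = q{<a href="it's" title='say "hi"'>};

my @values;

# (["'])            capture the opening quote as group 1
# (?:(?!\1).)*      take any character that is NOT that same quote
# \1                require the same quote to close the value
while ($html =~ /=\s*(["'])((?:(?!\1).)*)\1/gs) {
    push @values, $2;
}

print "$_\n" for @values;
```

The lookahead is checked before every character, so the value may contain the other kind of quote without ending the match early.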