in thread Extracting C Style Comments Revised (JavaScript)

Possible Hack

I think one way to do this may be to assume that every JavaScript regular expression literal follows either an equals sign "=" or a left parenthesis "(".
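To make the assumption concrete, here is a rough sketch of the kind of JavaScript input I have in mind. All of the names are made up for illustration. The first two literals fit the assumption; the rest are positions where a regex literal is also legal JavaScript but is not preceded by "=" or "(", so they are the cases I would check first.

```javascript
// Hypothetical sample input for the comment stripper; every name is made up.

// Literals that fit the assumption: preceded by "=" or "(".
var re = /\*/;
var hit = "a*b".match(/x/);

// Literals that are legal JavaScript but follow some other token.
function isBlank(s) { return /^\s*$/.test(s); }   // after "return"
var list = [/a\/\/b/, /c/];                       // after "[" and ","
var pick = hit ? /yes/ : /no/;                    // after "?" and ":"
var either = hit || /fallback/;                   // after "||"
```

This is only a list of candidate inputs, not a claim about what the substitution does with each one.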

    $data =~ s{
        # First, we'll list things we want
        # to match, but not throw away
        (
            (?:
                # Match RegExp
                [\(=]\s*                    # start with ( or =
                / [^\r\n\*\/][^\r\n\/]* /   # All RegExps start and end
                                            # with slash, but first one
                                            # must not be followed by *
                                            # and cannot contain newline
                                            # chars
                                            #
                                            # var re = /\*/;
                                            # a = b.match (/x/);
            )
          |  # -or-
            [^"'/]+                         # other stuff
          |  # -or-
            (?:"[^"\\]*(?:\\.[^"\\]*)*" [^"'/]*)+   # double quoted string
          |  # -or-
            (?:'[^'\\]*(?:\\.[^'\\]*)*' [^"'/]*)+   # single quoted constant
        )
      |
        # or we'll match a comment. Since it's not in the
        # $1 parentheses above, the comments will disappear
        # when we use $1 as the replacement text.
        /                                   # (all comments start with a slash)
        (?:
            \*[^*]*\*+(?:[^/*][^*]*\*+)*/   # traditional C comments
          |  # -or-
            /[^\n]*                         # C++ //-style comments
        )
    }{$1}gsx;

Does anyone know how to improve on this or how to make it fail?