Possible Hack
I think one way to do this may be to assume that every JavaScript regular expression literal follows an equals sign "=" or a left parenthesis "(", optionally separated from it by whitespace:
$data =~ s{ # First, we'll list things we want
            # to match, but not throw away
    (
        (?:                                    # Match RegExp
            [\(=]\s*                           # start with ( or =
            / [^\r\n\*\/][^\r\n\/]* /          # All RegExps start and end
                                               # with slash, but first one
                                               # must not be followed by *
                                               # and cannot contain newline
                                               # chars
                                               #
                                               #   var re = /\*/;
                                               #   a = b.match (/x/);
        )
      |                                        # -or-
        [^"'/]+                                # other stuff
      |                                        # -or-
        (?:"[^"\\]*(?:\\.[^"\\]*)*" [^"'/]*)+  # double quoted string
      |                                        # -or-
        (?:'[^'\\]*(?:\\.[^'\\]*)*' [^"'/]*)+  # single quoted constant
    )
  |
    # or we'll match a comment. Since it's not in the
    # $1 parentheses above, the comments will disappear
    # when we use $1 as the replacement text.
    /                                          # (all comments start with a slash)
    (?:
        \*[^*]*\*+(?:[^/*][^*]*\*+)*/          # traditional C comments
      |                                        # -or-
        /[^\n]*                                # C++ //-style comments
    )
}{$1}gsx;
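As a rough check of the assumption that RegExps follow "=" or "(", the sketch below pulls the RegExp sub-pattern out on its own and runs it against a few sample lines. The sample lines and the names $regex_literal and @samples are my own illustrations, not taken from any real script.

```perl
use strict;
use warnings;

# The RegExp sub-pattern from the substitution, on its own:
# "(" or "=", optional whitespace, then /.../ on a single line.
my $regex_literal = qr{ [\(=] \s* / [^\r\n\*\/] [^\r\n\/]* / }x;

# Hypothetical sample lines, chosen only for illustration.
my @samples = (
    'var re = /\*/;',          # literal after "="
    'a = b.match (/x/);',      # literal after "("
    'return /x/.test(s);',     # literal after "return"
    'var list = [/x/, /y/];',  # literals after "[" and ","
);

# Report whether the sub-pattern finds a RegExp literal in each line.
for my $line (@samples) {
    my $result = $line =~ $regex_literal ? 'matched' : 'missed';
    printf "%-28s %s\n", $line, $result;
}
```

The first two lines are reported as matched and the last two as missed, because a literal after "return", "[" or "," does not follow "=" or "(", so the assumption does not cover those positions.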
Does anyone know how to improve on this or how to make it fail?