The regular expression: (?-imsx:(&(?!(?:amp|lt|gt);)|[><])) matches as follows: NODE EXPLANATION ---------------------------------------------------------------------- (?-imsx: group, but do not capture (case-sensitive) (with ^ and $ matching normally) (with . not matching \n) (matching whitespace and # normally): ---------------------------------------------------------------------- ( group and capture to \1: ---------------------------------------------------------------------- & '&' ---------------------------------------------------------------------- (?! look ahead to see if there is not: ---------------------------------------------------------------------- (?: group, but do not capture: ---------------------------------------------------------------------- amp 'amp' ---------------------------------------------------------------------- | OR ---------------------------------------------------------------------- lt 'lt' ---------------------------------------------------------------------- | OR ---------------------------------------------------------------------- gt 'gt' ---------------------------------------------------------------------- ) end of grouping ---------------------------------------------------------------------- ; ';' ---------------------------------------------------------------------- ) end of look-ahead ---------------------------------------------------------------------- | OR ---------------------------------------------------------------------- [><] any character of: '>', '<' ---------------------------------------------------------------------- ) end of \1 ---------------------------------------------------------------------- ) end of grouping ----------------------------------------------------------------------