in reply to Matching ampersands that are NOT part of an HTML entity?
s/ & (?! (?: \# (?: x[\da-f]+ | \d+ ) | [a-z]+ ) ; ) /&amp;/xi
I factored out the "#" and removed extraneous captures and groupings, but the key is the negative lookahead, (?!...).
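To see what the lookahead accepts and rejects, here is a small sketch run against a made-up sample string. The helper name escape_bare_amps and the sample input are mine, not from the original question, and I've assumed the replacement is &amp; and added /g so every bare ampersand in the string is handled.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative): escape every "&" that
# is NOT followed by something shaped like an entity body plus ";".
sub escape_bare_amps {
    my ($text) = @_;
    # \# is escaped because a bare "#" starts a comment under /x.
    $text =~ s/ & (?! (?: \# (?: x[\da-f]+ | \d+ ) | [a-z]+ ) ; ) /&amp;/xgi;
    return $text;
}

# Made-up sample: two bare ampersands, a named entity, a decimal and a
# hex character reference, a malformed "&#;" and a trailing "&".
print escape_bare_amps('AT&T &amp; R&D &#169; &#xA9; &#; &'), "\n";
# prints: AT&amp;T &amp; R&amp;D &#169; &#xA9; &amp;#; &amp;
```

Note that anything shaped like &word; is left alone here, whether or not "word" is a real entity name.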
Update: And if you wanted to only accept known entities,
local our %known = map { $_ => 1 } qw( eacute Eacute ecirc Ecirc ... );
s/ & (?! (?: \# (?: x[\da-f]+ | \d+ ) | ([a-z]+) (?(?{ !$known{$1} }) (?!) ) ) ; ) /&amp;/xi
or
use Regexp::List qw( );
my @known = qw( eacute Eacute ecirc Ecirc ... );
my $known = Regexp::List->new()->list2re(@known);
s/ & (?! (?: \# (?: x[\da-f]+ | \d+ ) | $known ) ; ) /&amp;/xi
Update: Escaped "#" as per reply.
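If you'd rather not pull in a CPAN module for the known-entity check, a plain alternation built with core Perl should behave the same way for short lists. This is only a sketch: it swaps Regexp::List for join/quotemeta, the entity list is a small illustrative subset, the helper name escape_unknown_amps is made up, and /g is added so the whole string is processed.

```perl
use strict;
use warnings;

# Illustrative subset only; a real list would be much longer.
my @known = qw( eacute Eacute ecirc Ecirc amp lt gt );

# Build a plain alternation, longest names first. Because $known is a
# compiled qr// object, it keeps its own (case-sensitive) flags even
# when interpolated into a pattern that uses /i.
my $alt   = join '|', map { quotemeta($_) }
            sort { length($b) <=> length($a) } @known;
my $known = qr/(?:$alt)/;

# Hypothetical helper: escape "&" unless it starts a numeric character
# reference or one of the known named entities.
sub escape_unknown_amps {
    my ($text) = @_;
    $text =~ s/ & (?! (?: \# (?: x[\da-f]+ | \d+ ) | $known ) ; ) /&amp;/xgi;
    return $text;
}

print escape_unknown_amps('caf&eacute; & &bogus; &Eacute; &EACUTE;'), "\n";
# prints: caf&eacute; &amp; &amp;bogus; &Eacute; &amp;EACUTE;
```

The alternation is unoptimized, which is the main thing Regexp::List would buy you on a long entity list.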
Re^2: Matching ampersands that are NOT part of an HTML entity?
by AnomalousMonk (Archbishop) on Aug 07, 2008 at 00:19 UTC
Re^2: Matching ampersands that are NOT part of an HTML entity?
by JavaFan (Canon) on Aug 07, 2008 at 12:02 UTC
by ikegami (Patriarch) on Aug 07, 2008 at 12:57 UTC