in reply to Need a regex to replace incomplete html entities
> I basically want to replace the string &,&#,,& to blank, but it should not replace &
This meets your "requirements" and is, IMHO, easier to understand and more intuitive than tybald89's solution:
  DB<19> p $test
  &,&#,,&,&&,&#,,&,&
  DB<20> p $test =~ s/(&#\d+;)|&#?\d*/$1/gr
  ,,,,&,,,,&
The trick is to first match correct entities and leave them unchanged by replacing them with themselves.
Incorrect entities are then found when the engine backtracks into the second alternative; there $1 is not set, so they are replaced with an empty string.
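For anyone who wants to try this outside the debugger, here is a minimal sketch of the same idea as a script. The sub name strip_incomplete_entities and the sample string are my own assumptions, not the OP's data. I also switched to the /e form with `$1 // ''`, because under `use warnings` a plain `$1` in the replacement would complain whenever the second alternative matches.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the thread).
# Needs Perl 5.14+ for the /r flag.
sub strip_incomplete_entities {
    my ($text) = @_;
    # Alternative 1: a complete numeric entity, captured and put back via $1.
    # Alternative 2: an incomplete fragment; $1 is undef, so '' is inserted.
    return $text =~ s{(&#\d+;)|&#?\d*}{$1 // ''}gre;
}

# Made-up sample: three incomplete fragments, two complete entities,
# and a trailing bare ampersand.
my $test = '&,&#,&#1,&#12;,&#38;,&';
print strip_incomplete_entities($test), "\n";   # prints ,,,&#12;,&#38;,
```

The order of the alternatives matters: the complete entity has to be tried first, otherwise the second alternative would eat its `&#12` prefix and leave a stray `;` behind.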
Please handle this with care; I'm not sure your requirements cover all the edge cases.
NB: yes, it will also replace &38 without #
  DB<22> p ',&38,' =~ s/(&#\d+;)|&#?\d*/$1/gr
  ,,
If that is not what you want, you can add further or-conditions to exclude this case.
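One possible reading of "exclude this case" is that `&` followed directly by digits should survive. Under that assumption, a sketch could add a second branch inside the capturing group. The sub name strip_but_keep_bare_digits is hypothetical, and whether `&38` should really be kept depends on the actual requirements.

```perl
use strict;
use warnings;

# Hypothetical variant: also protect "&" followed directly by digits.
# Needs Perl 5.14+ for the /r flag.
sub strip_but_keep_bare_digits {
    my ($text) = @_;
    # Group 1 now has two branches that are put back unchanged:
    #   &#\d+;  a complete numeric entity
    #   &\d+    an ampersand followed by digits, without '#'
    # Anything else starting with '&' falls through and is removed.
    return $text =~ s{(&#\d+;|&\d+)|&#?\d*}{$1 // ''}gre;
}

print strip_but_keep_bare_digits(',&38,'), "\n";        # prints ,&38,
print strip_but_keep_bare_digits('&,&#1,&#38;'), "\n";  # prints ,,&#38;
```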
Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!
Update: replaced s/(&#?\d*;)|... with s/(&#\d+;)|...