> I basically want to replace the string &,&#,,&#38 to blank, but it should not replace &#38;
This meets your "requirements" and is IMHO easier to understand and more intuitive than tybald89's solution:
DB<19> p $test
&,&#,,&#38,&#38;&,&#,,&#38,&#38;
DB<20> p $test =~ s/(&#\d+;)|&#?\d*/$1/gr
,,,,&#38;,,,,&#38;
The trick is to first match correct entities and leave them unchanged by replacing them with themselves.
Incorrect entities fail that first alternative, so the second alternative matches them instead; $1 is then empty (undefined) and they are replaced with nothing.
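For use outside the debugger, the same substitution can be wrapped in a small script. This is only a sketch: the sub name `strip_incomplete_entities` is made up for illustration, and the `$1 // ''` with `/e` is my addition, since under `use warnings` an undefined `$1` in the replacement would otherwise produce an "uninitialized value" warning. It assumes Perl 5.14 or later for the `/r` modifier.

```perl
use v5.14;      # /r needs 5.14; "//" (defined-or) needs 5.10
use warnings;

# Hypothetical helper name; wraps the substitution shown in the debugger session.
sub strip_incomplete_entities {
    my ($text) = @_;
    # Alternative 1 captures a complete numeric entity such as "&#38;" into $1.
    # Alternative 2 matches an incomplete fragment ("&", "&#", "&#38") and
    # captures nothing, so $1 is undef and the fragment is replaced with ''.
    return $text =~ s{(&#\d+;)|&#?\d*}{ $1 // '' }ger;
}

print strip_incomplete_entities('&,&#,,&#38,&#38;'), "\n";   # prints ,,,,&#38;
```

Note that a named entity such as `&amp;` is not covered by the first alternative, so its leading `&` would be stripped as well; whether that matters depends on the input.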
Please handle this with care; I'm not sure your requirements don't miss some edge cases.
NB: yes, it will also replace &38 without the #:
DB<22> p ',&38,' =~ s/(&#\d+;)|&#?\d*/$1/gr
,,
If that is not what you want, you can add further alternatives to exclude this case.
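One possible way to exclude the `&38` case, sketched under the assumption that "&" followed directly by digits should be left alone: add it as a second alternative inside the capturing group, so it is put back unchanged just like a complete entity. The sub name is hypothetical, and this is only one of several ways to write it.

```perl
use v5.14;      # /r needs 5.14; "//" (defined-or) needs 5.10
use warnings;

# Hypothetical variant: "&" followed directly by digits (no "#") joins the
# capturing group, so it is replaced with itself instead of being removed.
sub strip_incomplete_keep_bare_digits {
    my ($text) = @_;
    return $text =~ s{(&#\d+;|&\d+)|&#?\d*}{ $1 // '' }ger;
}

print strip_incomplete_keep_bare_digits(',&38,'), "\n";             # prints ,&38,
print strip_incomplete_keep_bare_digits('&,&#,,&#38,&#38;'), "\n";  # prints ,,,,&#38;
```

The order of alternatives matters here: the "keep" patterns have to come first, so that the catch-all `&#?\d*` only sees what they rejected.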
Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!
Update: replaced s/(&#?\d*;)|... with s/(&#\d+;)|
In reply to "Re: Need a regex to replace incomplete html entities" by LanX, in thread "Need a regex to replace incomplete html entities" by Chris Daniel