in reply to Re^2: Question for regex experts
in thread Question for regex experts

You need to implement a lookup table containing all of the HTML character entity names, or at least the subset you expect to encounter.

Then, each time your program encounters code; instead of &code; (where code is a name in the table), it will need to add an ampersand at the start.
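
Here is a minimal sketch of that idea in Python, not a definitive implementation. The table ENTITY_NAMES and the function restore_ampersands are hypothetical names chosen for illustration, and the table holds only a small sample of entity names; for a fuller list you could build it from the keys of the standard library's html.entities.name2codepoint.

```python
import re

# Hypothetical lookup table: a small subset of HTML entity names.
# Extend it with whichever names your input can contain.
ENTITY_NAMES = {"amp", "lt", "gt", "quot", "nbsp", "copy", "eacute"}

# Build one alternation from the table, longest names first so a longer
# name is preferred over a shorter one that is its prefix.
_ALTERNATION = "|".join(
    re.escape(name) for name in sorted(ENTITY_NAMES, key=len, reverse=True)
)

# Match "name;" only when it is NOT already preceded by "&" and is not
# the tail of a longer word (so "camp;" is left alone).
_MISSING_AMP = re.compile(r"(?<![&\w])(" + _ALTERNATION + r");")


def restore_ampersands(text):
    """Prefix '&' to any known 'name;' that is missing its ampersand."""
    return _MISSING_AMP.sub(r"&\1;", text)


if __name__ == "__main__":
    print(restore_ampersands("a lt; b and c &gt; d"))
```

The word-character guard is a trade-off: it avoids mangling ordinary words that happen to end in an entity name, but it also means an entity glued directly onto the preceding letters will not be repaired. Drop the \w from the lookbehind if your data needs the looser behaviour.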