I take a text document, run a series of regex substitutions over it, write the munged text to a new file, and then run a data mining application over that file. The application returns byte offsets for the data it extracts, which can be used to highlight that information in the text. The problem is that the byte offsets refer to the munged text, but I need to highlight the matches in the original text. I am stumped as to how to reverse the regex substitutions in order to recover the original offsets.
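To make the mismatch concrete, here is a minimal sketch of the situation. The sample text, the whitespace-collapsing pattern, and the variable names are all made up for illustration; they stand in for the real substitutions and the real tool output.

```python
import re

# Hypothetical original document text.
original = "Dr.  Smith   met Ms.  Jones."

# Hypothetical substitution: collapse each run of whitespace into one space.
munged = re.sub(r"\s+", " ", original)

# Stand-in for the data mining tool: it reports a byte offset into the
# munged text for the item it extracted.
target = "Jones".encode("utf-8")
munged_offset = munged.encode("utf-8").index(target)

# The offset that is actually needed for highlighting the original text.
original_offset = original.encode("utf-8").index(target)

print(munged_offset, original_offset)
# Using the munged offset directly on the original text hits the wrong span.
print(repr(original[munged_offset:munged_offset + 5]))
```

Each substitution that changes the length of the text shifts every later offset, so the gap between the two offsets grows with the number of replacements that occur before the extracted item.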