in reply to Finding and hightlight information

Without knowing the nature of the substitutions you are making, or what the original file is, or how it is used, it's difficult to know whether this idea might fly or not, but here it is anyway.

When I first started coding--in assembler, many moons ago--it was quite common practice to embed blocks of 16 or 32 nops at the end of each block of code.

These blocks of nops were called "patch areas". The idea was that if a bug was found after a program had been compiled, you could patch the executable rather than having to rebuild it, and the patch areas gave a routine room to grow in size if the fix needed it.

Why would you do this rather than rebuild? Well, it was considered safer to patch the executable, as there is considerably less likelihood of unwittingly making other changes. E.g. someone omits or adds a compiler switch, a #define or whatever, and having fixed one bug, you suddenly start getting several others showing up in completely unrelated parts of the code. I believe that the technique is still actively used in such things as satellite control software, the space shuttle etc.

Anyway, back to the question. Depending upon the nature of the substitutions you are making, you might be able to adjust the substitutions such that (most of) the offsets within the file remain the same after substitution as before. For net deletions this is fairly easy. If the replacement text is shorter than the original, you can pad it--with spaces or nulls for example.
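To make the easy case concrete, here is a minimal sketch in Python (my choice of language, since I don't know what you are using). The function name replace_padded and the sample text are made up for illustration. It assumes the replacement is never longer than the original and that a single pad character is acceptable to whatever reads the file.

```python
def replace_padded(text: str, old: str, new: str, pad: str = " ") -> str:
    """Replace every `old` with `new`, padding `new` up to len(old).

    Because each replacement occupies exactly as many characters as the
    text it replaces, every offset in the result matches the original.
    `pad` must be a single character (a str.ljust requirement).
    """
    if len(new) > len(old):
        raise ValueError("replacement is longer than the original")
    return text.replace(old, new.ljust(len(old), pad))


original = "the quick brown fox"
patched = replace_padded(original, "quick", "red")

# Same overall length, and "brown" still starts at the same offset.
assert len(patched) == len(original)
assert patched.index("brown") == original.index("brown")
```

Swapping pad for "\0" gives the null-byte variant, if the consumers of the file tolerate nulls.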

The problem comes when the replacement is longer than the original. Depending upon the nature of the original file, and what applications are used to view/manipulate it, you might get away with adding some 'patch space' to it.

For instance: if you added 10 or 20 spaces or null bytes to the end of each line, it might give you latitude to make the substitutions and have enough play to adjust the padding at the end of the following or previous lines to compensate for the changes. Obviously this wouldn't by itself cater for all possibilities. You might need to add a few nulls to the end of each word in the original file.
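A rough sketch of the per-line version, again in Python and again with made-up names (PATCH, add_patch_space, patch_line). It assumes space padding, that trailing spaces on a line carry no meaning, and that each substitution is absorbed by the slack on its own line. Borrowing from neighbouring lines, which is where the real juggling starts, is not attempted.

```python
PATCH = 20  # hypothetical amount of patch space added to each line


def add_patch_space(line: str, amount: int = PATCH) -> str:
    """Append `amount` spaces of slack to a line (no newline expected)."""
    return line + " " * amount


def patch_line(line: str, old: str, new: str) -> str:
    """Substitute within a padded line while keeping its total length.

    Trailing spaces are treated as slack: they are stripped, the
    substitution is made, and the line is re-padded to its old length.
    """
    target = len(line)
    body = line.rstrip(" ").replace(old, new)
    if len(body) > target:
        raise ValueError("not enough patch space on this line")
    return body.ljust(target)


padded = add_patch_space("colour of the sky")
patched = patch_line(padded, "colour", "wavelength")

# The line is exactly as long as before, so every following line
# still starts at the same offset in the file.
assert len(patched) == len(padded)
```

Note that offsets within the patched line, after the point of substitution, do still move. Only the start of each subsequent line is preserved, which is the "(most of)" caveat above.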

Having typed all that, I think that the effort involved in getting the padding juggling algorithm correct would probably be much more than building a lookup table to do the mapping, but there you go. Only you will know if this has any merit for your situation.


Examine what is said, not who speaks.
1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
3) Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke.