How about running a diff on the two files, then dividing the size of the diff by the size of the file? That gives you the percentage that changed, and subtracting it from 100% gives you the similarity. You might want to prune the diff output so it only contains the lines from the comparison file; otherwise the diff will be a little over twice the expected size, since both versions of every changed line are printed, plus some fluff from diff itself.
This is more relevant than just comparing the sizes of the files, which is obviously not a solution since it gives no consideration to content.
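
As a rough sketch of the idea, here is one way it could look in Python. It assumes the standard-library `difflib` module stands in for the `diff` command, and it measures size in lines rather than bytes. The function name `similarity_percent` is made up for illustration. Only the lines that come from the comparison (new) file are counted, so the result is not inflated by the old versions of changed lines or by diff's own markers.

```python
import difflib


def similarity_percent(old_text: str, new_text: str) -> float:
    """Estimate similarity as 100% minus the share of changed lines in new_text."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()

    # Avoid dividing by zero: two empty files are identical,
    # an empty file compared against a non-empty one shares nothing.
    if not new_lines:
        return 100.0 if not old_lines else 0.0

    # ndiff prefixes every output line with a two-character code:
    #   "  " common to both, "- " only in old, "+ " only in new, "? " hint lines.
    # Keep only the "+ " lines, i.e. the pruned diff for the comparison file.
    changed = sum(
        1 for line in difflib.ndiff(old_lines, new_lines) if line.startswith("+ ")
    )

    return 100.0 * (1 - changed / len(new_lines))


if __name__ == "__main__":
    old = "alpha\nbeta\ngamma\ndelta\n"
    new = "alpha\nbeta\nGAMMA\ndelta\n"
    print(similarity_percent(old, new))
```

To compare real files, read each one into a string first (for example with `pathlib.Path(name).read_text()`) and pass both strings in. Counting lines is only one possible definition of "size"; counting characters in the changed lines would be closer to a byte-for-byte comparison.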