The problem with the core diff tools is that they assume the two files are in the same order, which they aren't. You could almost think of the problem as a comparison of two sets of lines, where I want to know which lines aren't in both sets (ignoring order). It would be a nasty problem if there were truly no order to the lines, but thankfully they are mostly in the same order, with some chunks out of order by a couple hundred lines (which is nothing given the size of the files: around 1.4 million lines).
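
To make the "two sets of lines" idea concrete, here is a minimal Python sketch of an order-insensitive comparison. It is only one possible approach, not what any diff tool does internally. It treats each file as a multiset of lines, so a line that appears twice in one file and once in the other is still reported once. The names `unmatched_lines` and `read_lines` are made up for illustration, and the sketch assumes both files fit in memory, which seems plausible at around 1.4 million lines but is not something I have measured.

```python
from collections import Counter


def read_lines(path):
    """Yield the lines of a text file without their trailing newline.

    Hypothetical helper; encoding and error handling are assumptions.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


def unmatched_lines(lines_a, lines_b):
    """Return (only_in_a, only_in_b) as Counters, ignoring line order.

    Each Counter maps a line to the number of unmatched copies of it.
    """
    count_a = Counter(lines_a)
    count_b = Counter(lines_b)
    # Counter subtraction drops entries whose count would be zero or negative,
    # so what remains is exactly the surplus on each side.
    return count_a - count_b, count_b - count_a
```

Calling `unmatched_lines(read_lines("old.txt"), read_lines("new.txt"))` (both paths are placeholders) would give the lines missing from each side. This ignores position entirely, so it does not take advantage of the files being mostly in the same order; it simply sidesteps the ordering assumption.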