Stats ain't my forte, but wouldn't this only be true if tests A & B were both capable of detecting all the possible unknown bugs?

If this is the case, then all you need to make this work is a surefire way of designing tests that are guaranteed to be capable of detecting all possible bugs. Actually, you would probably need a way of designing two independent tests, each capable of detecting all possible bugs, as I doubt it would work if the two tests were the same :)

