in reply to Re^4: Calculating corruption
in thread Calculating corruption

Furthermore, there are a few different revisions of this encrypted file, and each revision has an expected outcome. Once you compute the file's std dev and compare it with the known expected values, if it is within a generally close range, that usually means the file is not corrupted.

That makes no sense at all.

Let's say the corruption that occurred was that every pair of bytes in the file was transposed -- e.g. abcdefgh corrupted to badcfehg. This type of corruption frequently occurs when files are written on big-endian machines and read on little-endian ones, or vice versa.

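Pretty much every statistic computed over the bytes of the file (mean, standard deviation, byte frequencies, byte-level entropy) will simply not change at all, because none of them depends on the order of the bytes.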
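A quick way to check this is to swap every pair of bytes in a buffer and compare the statistics before and after. The following is a minimal Python sketch on made-up data; the helper names `swap_pairs` and `byte_stats` are illustrative only and do not come from any tool discussed in this thread.

```python
import statistics
from collections import Counter

def swap_pairs(data: bytes) -> bytes:
    """Transpose every pair of bytes: abcdefgh -> badcfehg."""
    out = bytearray(data)
    # Stop before a trailing odd byte, which has no partner to swap with.
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return bytes(out)

def byte_stats(data: bytes):
    """Return (mean, population std dev, byte-value histogram)."""
    return statistics.fmean(data), statistics.pstdev(data), Counter(data)

# Synthetic stand-in for a real file (assumption: any byte content will do).
original = bytes((i * 37 + 11) % 256 for i in range(4096))
corrupted = swap_pairs(original)

print("content changed:", corrupted != original)
print("stats identical:", byte_stats(corrupted) == byte_stats(original))
```

Because the histogram is identical, any measure derived from byte counts alone comes out the same for both buffers.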

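Equally, if you change any 1 bit in any one byte of a (say) 1MB file, the standard deviation moves by something on the order of one part in a million or less. That is far smaller than the spread between your known-good revisions, so a "within a generally close range" test has no chance of detecting the change.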
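To get a feel for how small the effect of a single flipped bit is, the sketch below flips one bit in a synthetic 1 MiB buffer and prints the standard deviation before and after. It is Python, the data is made up, and `flip_bit` and `pstdev_bytes` are hypothetical helpers written for this illustration.

```python
import math

def flip_bit(data: bytes, index: int, bit: int) -> bytes:
    """Return a copy of data with one bit inverted in the byte at index."""
    out = bytearray(data)
    out[index] ^= 1 << bit
    return bytes(out)

def pstdev_bytes(data: bytes) -> float:
    """Population std dev of the byte values, using exact integer sums."""
    n = len(data)
    total = sum(data)
    total_sq = sum(b * b for b in data)
    # variance = E[x^2] - E[x]^2 = (n * total_sq - total^2) / n^2
    return math.sqrt(n * total_sq - total * total) / n

SIZE = 1 << 20  # 1 MiB of synthetic data standing in for a real file
original = bytes((i * 37 + 11) % 256 for i in range(SIZE))
damaged = flip_bit(original, index=12345, bit=0)

before = pstdev_bytes(original)
after = pstdev_bytes(damaged)
print(f"std dev before:  {before:.9f}")
print(f"std dev after:   {after:.9f}")
print(f"relative change: {abs(after - before) / before:.2e}")
```

The two values differ only many digits after the decimal point, so any acceptance range wide enough to cover several known-good revisions will accept the damaged buffer too.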

You're flogging a dead horse.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^6: Calculating corruption
by james28909 (Deacon) on Oct 19, 2014 at 17:57 UTC
    I never once said this was a way to tell with 100% certainty whether the file is corrupt. As a matter of fact, I said there is no way to tell. ALSO, I said that there is an expected outcome that is measured from known good files. As long as you fall within that range, it is an indicator that the file /MAY/ be good, NOT that it is 100% corruption free. You're trying to prove a point that everyone knows.

    Furthermore, you can run a statistical analysis and get the percentage in which each byte value (0x00 - 0xFF) shows up in the file. Likewise, if a percentage is below or above a certain range, you can just about bet the file is corrupt :)
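The byte-percentage check described here could be sketched roughly as follows. This is Python on synthetic data, not a reconstruction of any existing tool: `byte_percentages`, `out_of_range`, and the `LOW`/`HIGH` bounds are all assumptions made up for the example, and real bounds would have to be measured from known good files.

```python
from collections import Counter

def byte_percentages(data: bytes) -> list:
    """Percentage of the data taken up by each byte value 0x00..0xFF."""
    if not data:
        raise ValueError("cannot analyse empty data")
    counts = Counter(data)
    return [100.0 * counts.get(value, 0) / len(data) for value in range(256)]

def out_of_range(data: bytes, low: float, high: float) -> list:
    """Byte values whose share of the data falls outside [low, high] percent."""
    return [value for value, pct in enumerate(byte_percentages(data))
            if not low <= pct <= high]

# Hypothetical bounds around the 100/256 (about 0.39%) share that each
# value would have in uniformly distributed data.
LOW, HIGH = 0.30, 0.48

good = bytes((i * 37 + 11) % 256 for i in range(65536))
# Simulated damage: the first 4 KiB overwritten with zero bytes.
damaged = bytes(4096) + good[4096:]

print("flagged in good data:   ", out_of_range(good, LOW, HIGH))
print("flagged in damaged data:", out_of_range(damaged, LOW, HIGH))
```

A check like this flags damage that changes the byte counts, such as a zeroed block. Since it looks only at counts, data whose bytes have merely been reordered passes unchanged.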

    Anyway, I'm not going to argue with you anymore. I know what you're talking about, sir, and I completely understand what you are saying. There is no true way to tell, even if the entropy range, the percentage analysis of byte values, and the std dev are within the known ranges. But if they are within this certain range, sir, I would put my money on the file being "ok" or corruption "free".

    If this didn't work and didn't help, there wouldn't be older tools that people still use to this day. All I am trying to do is replicate these tools/functions. The main goal of this method is that if any of the calculations fall outside the expected range, it should make flags go off in your head and signal that the file is no good, and that if you use it you will surely be messing up. :)
      But if they are within this certain range, sir, I would put my money on the file being "ok" or corruption "free".

      You'd lose your money.


        Probability says I might or might not, sir.
        I'll give you some files and proof tomorrow, or the next day, if I can get some free time. The main reason is that I know everybody wants to see this. I may not be a seasoned professional when it comes to programming, but I am seasoned when it comes to what I have done hundreds of times before, based on an expected outcome.