in reply to Re^3: Calculating corruption
in thread Calculating corruption

I am not trying to argue with you at all, so please don't take it personally :)

But actually, upon further examination, one of the programs I have used in the past has this std dev function. Furthermore, there are a few different revisions of this encrypted file, and each revision has an expected outcome. Once you compute the file's std dev and compare it with the known expected value, if it is within a generally close range, that usually means the file is not corrupted. I am not saying this is the be-all and end-all of how to check a file for corruption, but somehow this other program is able to compute it, and it is within a reasonably expected range... every time... for each revision of the file, unless the file is corrupted. Maybe I need to script something up real quick and just see what the outcome will be :)
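
A minimal sketch of the kind of check described above, in Python. The function names, the revision labels, and the expected values and tolerance are all made up for illustration; the real tool's numbers and method are not known here.

```python
import statistics

def byte_stddev(data: bytes) -> float:
    """Population standard deviation of the byte values (0..255)."""
    return statistics.pstdev(data)

def within_expected(data: bytes, expected: float, tolerance: float) -> bool:
    """True if the std dev is within +/- tolerance of a known-good value."""
    return abs(byte_stddev(data) - expected) <= tolerance

# Hypothetical known-good std devs, one per file revision (invented numbers).
EXPECTED_STDDEV = {"rev_a": 73.9, "rev_b": 74.1}

# Synthetic stand-in for a file: every byte value equally common.
sample = bytes(range(256)) * 16
print(within_expected(sample, EXPECTED_STDDEV["rev_a"], tolerance=0.5))
```

Passing this check only says the byte values are spread about as widely as in a known-good file; it says nothing about whether the bytes are in the right places.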

Re^5: Calculating corruption
by BrowserUk (Patriarch) on Oct 19, 2014 at 10:34 UTC
    Furthermore, there are a few different revisions of this encrypted file, and each revision has an expected outcome. Once you compute the file's std dev and compare it with the known expected value, if it is within a generally close range, that usually means the file is not corrupted.

    That makes no sense at all.

    Let's say the corruption that occurred was that every pair of bytes in the file was transposed -- e.g. abcdefgh corrupted to badcfehg; a type of corruption that frequently occurs when files are written on big-endian machines and read on little-endian ones, or vice versa.

    The result of pretty much every type of statistical analysis applied to the bytes of the file will simply not change at all, because those statistics depend only on which byte values are present, not on the order they appear in.
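
    A small Python sketch of that transposition case. The helper name swap_byte_pairs and the sample data are invented for illustration; it compares the standard deviation and the per-byte counts before and after the swap.

```python
import statistics
from collections import Counter

def swap_byte_pairs(data: bytes) -> bytes:
    """Transpose each adjacent pair of bytes: b'abcdefgh' -> b'badcfehg'."""
    out = bytearray(data)
    # A trailing odd byte, if any, is left where it is.
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return bytes(out)

original = bytes(range(256)) * 4 + b"abcdefgh"
corrupted = swap_byte_pairs(original)

print(corrupted != original)                                        # content differs
print(statistics.pstdev(original) == statistics.pstdev(corrupted))  # same std dev
print(Counter(original) == Counter(corrupted))                      # same byte counts
```

    All three comparisons hold: the data is different, yet the std dev and the byte histogram are identical.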

    Equally, if you change any 1 bit in any one byte of a (say) 1MB file, the standard deviation moves by so little (roughly 0.0001 at most, for evenly distributed bytes) that no "generally close range" comparison could ever detect the change.
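
    A rough Python sketch to put a number on the single-bit case. The input is a synthetic 1 MiB buffer in which every byte value is equally common (an assumption standing in for an encrypted file), and the helper names are made up.

```python
import math
from collections import Counter

def byte_stddev(data: bytes) -> float:
    """Population std dev of the byte values, computed from a histogram."""
    counts = Counter(data)
    n = len(data)
    mean = sum(v * c for v, c in counts.items()) / n
    var = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
    return math.sqrt(var)

def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    out = bytearray(data)
    out[byte_index] ^= 1 << bit
    return bytes(out)

original = bytes(range(256)) * 4096   # exactly 1 MiB
base = byte_stddev(original)

# Flip the lowest and the highest bit of one arbitrarily chosen byte.
shift_low = abs(byte_stddev(flip_bit(original, 12345, 0)) - base)
shift_high = abs(byte_stddev(flip_bit(original, 12345, 7)) - base)
print(f"std dev {base:.4f}, shift (bit 0) {shift_low:.3e}, shift (bit 7) {shift_high:.3e}")
```

    On this input both shifts come out well under 0.001 against a std dev of about 73.9, far tighter than any tolerance band a tool would plausibly use.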

    You're flogging a dead horse.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I never once said this was a way to tell with 100% certainty whether the file is corrupt. As a matter of fact, I said there is no way to tell. ALSO, I said that there is an expected outcome that is measured from known good files. As long as you fall within that range, it is an indicator the file /MAY/ be good, NOT that it is 100% corruption free. You're trying to prove a point that everyone knows.

      Furthermore, you can run a statistical analysis and get the percentage in which each byte value (0x00 - 0xFF) shows up in the file, and likewise... if it is below or above a certain range, you can just about bet the file is corrupt :)

      Anyway, I'm not going to argue with you anymore. I know what you're talking about, sir, and I completely understand what you are saying. There is no true way to tell, even if the entropy range, percentage analysis of byte values, and std dev are within the known ranges. But if they are within this certain range, sir, I would put my money on the file being "ok" or corruption "free".

      If this didn't work and did not help, then there wouldn't be older tools that people still use to this day. All I am trying to do is replicate these tools/functions. The main goal of this method is that, if any of the calculations fall outside of the expected range, flags should go off in your head, signifying that the file is no good and that if you use it, you will surely be messing up. :)
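
      A minimal Python sketch of that byte-percentage analysis. The function names and the low/high thresholds are hypothetical; real limits would have to be measured from known good files.

```python
from collections import Counter

def byte_percentages(data: bytes) -> list[float]:
    """Percentage of the data taken up by each byte value 0x00..0xFF."""
    if not data:
        raise ValueError("empty input")
    counts = Counter(data)
    n = len(data)
    return [100.0 * counts.get(v, 0) / n for v in range(256)]

def out_of_range(data: bytes, low: float, high: float) -> list[int]:
    """Byte values whose share of the data falls outside [low, high] percent."""
    pct = byte_percentages(data)
    return [v for v, p in enumerate(pct) if not (low <= p <= high)]

# Synthetic examples: evenly spread bytes vs. a buffer of nothing but zeros.
even = bytes(range(256)) * 10
zeros = bytes(1000)
print(len(out_of_range(even, 0.3, 0.5)), len(out_of_range(zeros, 0.3, 0.5)))
```

      Like the std dev, these percentages ignore the order of the bytes, so they flag gross damage such as a zero-filled file but not reordered data.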
        But if they are within this certain range, sir, I would put my money on the file being "ok" or corruption "free".

        You'd lose your money.

