PerlMonks  

Re: Compressing and Encrypting files on Windows

by tachyon (Chancellor)
on Nov 01, 2004 at 14:03 UTC ( [id://404306] )


in reply to Compressing and Encrypting files on Windows

Doing it in memory is by far the easiest approach: encrypt/decrypt and compress/decompress are each a one-line call. This example uses Crypt::CBC with Blowfish, but you could use your favourite block cipher. The order is VITAL: plaintext->compress->crypt->file is the only way to go (and the reverse on the way out, of course).

By definition a decent encryption algorithm will turn '0'x1000000 into random (and thus totally incompressible) noise. A compression algorithm, on the other hand, will compress the hell out of that plaintext because of the repeated pattern, and then you can crypt the result. For interest, reverse the order of crypt<->compress and you will see no compression at all. This is a simple measure of the quality of the encryption - the compression algorithm can no longer see any patterns (and we know the plaintext has patterns). Crap encryption WILL compress. Try ROT13, or bongo crypt ;-)
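The ROT13 claim is easy to check with core modules alone. This is only a minimal sketch: it assumes that pseudo-random bytes from rand() are a fair stand-in for the output of a decent cipher (they are not real ciphertext), and compressed_length is a made-up helper name.

```perl
use strict;
use warnings;
use Compress::Zlib;

my $plain = "abcbefghijklmnopqrstuvwxyz\n" x 1000;

# ROT13 swaps letters but leaves every repeat exactly where it was.
( my $rot13 = $plain ) =~ tr/A-Za-z/N-ZA-Mn-za-m/;

# Assumption for this sketch: pseudo-random bytes stand in for what a
# decent cipher would emit. This is not real ciphertext.
my $noise = join '', map { chr int rand 256 } 1 .. length $plain;

# Hypothetical helper: size of the zlib-compressed form of a string.
sub compressed_length { return length compress( $_[0] ) }

printf "%-10s %6d -> %6d bytes\n", $_->[0], length $_->[1], compressed_length( $_->[1] )
    for [ plaintext => $plain ], [ rot13 => $rot13 ], [ noise => $noise ];
```

The plaintext and ROT13 rows should shrink to a tiny fraction of their size, while the noise row should stay at roughly its original length.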

    $|++;
    use Compress::Zlib;
    use Crypt::CBC;

    my $cipher = Crypt::CBC->new( {
        'key'    => 'secret key',
        'cipher' => 'Blowfish',
    } );

    my $file = 'c:/text.txt';
    write_file( $file, $cipher, "abcbefghijklmnopqrstuvwxyz\n" x 1000 );
    print "\n\nDecrypting and decompressing\n\n";
    my $data = read_file( $file, $cipher );
    printf "Got back %d bytes\n%s\n[snip]", length($data), substr( $data, 0, 64 );

    # plaintext -> compress -> encrypt -> file
    sub write_file {
        my ( $file, $cipher, $data ) = @_;
        printf "Before compression %d bytes\n", length($data);
        $data = compress($data);
        printf "After compression %d bytes\n", length($data);
        $data = $cipher->encrypt($data);
        printf "After encryption %d bytes\n", length($data);
        open my $fh, '>', $file or die $!;
        binmode $fh;    # ciphertext is binary; stops Windows turning \n into \r\n
        print $fh $data;
        close $fh;
        # show only the printable characters of the ciphertext
        print map { s/[^\040-\177]//g; $_ } $data;
    }

    # file -> decrypt -> uncompress -> plaintext
    sub read_file {
        my ( $file, $cipher ) = @_;
        open my $fh, '<', $file or die $!;
        binmode $fh;    # read the ciphertext back byte for byte
        local $/;       # slurp the whole file in one read
        my $data = <$fh>;
        close $fh;
        $data = $cipher->decrypt($data);
        $data = uncompress($data);
        return $data;
    }

    __DATA__
    Before compression 27000 bytes
    After compression 116 bytes
    After encryption 136 bytes
    RandomIV#vTP;\\-Ij{Et@t--sC7SWy{<6P~)}y9p6_r$0bB(d1$k8:6,

    Decrypting and decompressing

    Got back 27000 bytes
    abcbefghijklmnopqrstuvwxyz
    abcbefghijklmnopqrstuvwxyz
    abcbefghij
    [snip]
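The jump from 116 to 136 bytes in the output above can be accounted for with a little arithmetic. This is a hedged sketch, not something taken from the Crypt::CBC source. It assumes a 16-byte header (the literal "RandomIV" visible at the start of the output, plus an 8-byte IV), Blowfish's 8-byte block, and padding that always adds between 1 and 8 bytes. expected_ciphertext_length is a made-up name.

```perl
use strict;
use warnings;

# Assumptions (not stated in the post): 8-byte blocks, padding that always
# adds 1..8 bytes, and a 16-byte "RandomIV" + IV header in front.
sub expected_ciphertext_length {
    my ($plain_len) = @_;
    my $block  = 8;
    my $header = 16;
    my $padded = $plain_len + ( $block - $plain_len % $block );
    return $header + $padded;
}

printf "%d\n", expected_ciphertext_length(116);    # prints 136
```

So under those assumptions the 116 compressed bytes are padded to 120, and the 16-byte header brings the file to 136.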

cheers

tachyon

Re^2: Compressing and Encrypting files on Windows
by jon_barber (Initiate) on Nov 02, 2004 at 11:23 UTC
    By definition a decent encryption algorithm will turn '0'x1000000 into random (and thus totally incompressible) noise.
    ...
    This is a simple measure of the quality of the encryption - the compression algorithm can no longer see any patterns (and we know the plaintext has patterns).

    I don't think this is true of one-time pads. I can imagine a situation where an OTP produces output that can then be highly compressed. Of course, this is a special-case nitpick :).

        What on earth makes you think data encrypted with a one-time pad contains redundant repeats and can thus be compressed? Please see Elgon's note at Re^2: Compressing and Encrypting files on Windows and follow the link.

        What makes you think that an OTP doesn't contain redundant repeats? As I understand it, an OTP operates by applying randomly generated data (the pad) to the data that you wish to encrypt. To decrypt it you then reverse the transformation with the same pad.

        Given that the pad is randomly generated, is it not possible to imagine that it might produce a ciphertext which does contain redundant repeats and is therefore highly compressible?

        As I stated in my original comment, it was a nitpick over the definition that "a decent encryption algorithm would produce ... noise."
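The two sides of this exchange can be put next to each other in a few lines. This is a rough sketch under stated assumptions: rand() stands in for a truly random pad (it is not a cryptographic source), the "unlucky" pad is deliberately contrived to equal the plaintext, and otp is a made-up helper name. Any one specific pad of this length has a probability of 1 in 256**27000 of being drawn.

```perl
use strict;
use warnings;
use Compress::Zlib;

my $plain = "abcbefghijklmnopqrstuvwxyz\n" x 1000;

# Hypothetical helper. XOR is its own inverse, so the same routine
# both encrypts and decrypts.
sub otp { my ( $data, $pad ) = @_; return $data ^ $pad }

# Typical case: a pad of pseudo-random bytes. rand() is NOT a
# cryptographic source; it only stands in for a true random pad.
my $random_pad = join '', map { chr int rand 256 } 1 .. length $plain;

# Contrived case: the single pad that happens to equal the plaintext,
# which turns the ciphertext into a run of zero bytes.
my $unlucky_pad = $plain;

for ( [ random => $random_pad ], [ unlucky => $unlucky_pad ] ) {
    my ( $name, $pad ) = @$_;
    my $cipher = otp( $plain, $pad );
    die "round trip failed" unless otp( $cipher, $pad ) eq $plain;
    printf "%-8s pad: %d -> %d bytes after compress\n",
        $name, length $cipher, length compress($cipher);
}
```

The random pad should give ciphertext that does not shrink, while the contrived pad gives ciphertext that compresses to almost nothing. Such a pad is possible, which is the nitpick, but it is astronomically unlikely to be drawn at random.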