satanklawz has asked for the wisdom of the Perl Monks concerning the following question:

Greetings all.

Granted, I am a beginner in Perl, and in the past 2 weeks I have come a long way :)

Here is my question to the fellow monks who reside here…

How would I compress streaming data in Perl? I know how to encrypt it using Blowfish with the technique described here, but I wouldn't know where to begin with compression.

My end goal is to have the data encrypted, store it locally, and send it to another computer as a compressed data stream.

Any suggestions as to how I should tackle this? Like I said, I already have the encryption part tackled; I'm now working on the compression.

Thanks for any suggestions, code examples, etc etc.

Russ

Edit ar0n -- fixed formatting

Re (tilly) 1: Streaming Compression
by tilly (Archbishop) on Dec 17, 2001 at 19:10 UTC
    First of all an important technical detail. You say that you want to encrypt and then compress. Unfortunately in order to compress you need to find and take advantage of patterns in the data - and encryption tries to make sure there are no understandable patterns. Therefore you need to compress and then encrypt.

    For a compression library I would use Compress::Zlib. If you want to stream you will need to use deflateInit/deflate/flush for compression and inflateInit/inflate for decompression. (Read the documentation for details - I don't have time at the moment to hack up some pseudocode.)
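    A minimal sketch of that deflateInit/deflate/flush and inflateInit/inflate sequence, assuming Compress::Zlib is installed. The helper names compress_chunks and decompress_pieces are made up for illustration, and encryption is left out entirely.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Compress any number of chunks into one zlib stream.
sub compress_chunks {
    my @chunks = @_;
    my ($deflater, $status) = deflateInit();
    die "deflateInit failed: $status" unless $status == Z_OK;

    my $compressed = '';
    for my $chunk (@chunks) {
        my ($out, $st) = $deflater->deflate($chunk);
        die "deflate failed: $st" unless $st == Z_OK;
        $compressed .= $out;    # may be empty while zlib is still buffering
    }

    # flush() with no argument finishes the stream.
    my ($tail, $st) = $deflater->flush();
    die "flush failed: $st" unless $st == Z_OK;
    return $compressed . $tail;
}

# Decompress a stream that arrives in arbitrary pieces.
sub decompress_pieces {
    my @pieces = @_;
    my ($inflater, $status) = inflateInit();
    die "inflateInit failed: $status" unless $status == Z_OK;

    my $plain = '';
    for my $piece (@pieces) {
        my $buffer = $piece;    # inflate() consumes its argument, so use a copy
        my ($out, $st) = $inflater->inflate($buffer);
        die "inflate failed: $st"
            unless ($st == Z_OK or $st == Z_STREAM_END);
        $plain .= $out;
        last if $st == Z_STREAM_END;
    }
    return $plain;
}

my $stream = compress_chunks("Hello, ", "World!\n");
print decompress_pieces(unpack '(a5)*', $stream);
```

    The receiving side does not need to know where the sender's chunk boundaries were; it just feeds whatever bytes arrive into inflate.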

    One warning. You need to think through what your buffering strategy is. Compression and encryption work best when they get as much data as they want to do a chunk of work. Streaming data often cares about latency. The two goals conflict - if a message halfway fills a compression buffer and you let compression work like it wants to, the message won't get sent for an indefinite amount of time.
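    One way to trade some compression for lower latency is to ask zlib for a sync flush after every message, so that everything written so far can be decoded by the other side immediately. This is only a sketch under that assumption: pack_message and unpack_message are hypothetical helper names, and how the packets are framed on the wire is not addressed.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Compress one message and force out everything zlib has buffered.
sub pack_message {
    my ($deflater, $message) = @_;
    my ($body, $st1) = $deflater->deflate($message);
    die "deflate failed: $st1" unless $st1 == Z_OK;
    my ($tail, $st2) = $deflater->flush(Z_SYNC_FLUSH);
    die "flush failed: $st2" unless $st2 == Z_OK;
    return $body . $tail;
}

# Decode one packet; $packet is a private copy, so inflate() may consume it.
sub unpack_message {
    my ($inflater, $packet) = @_;
    my ($out, $st) = $inflater->inflate($packet);
    die "inflate failed: $st"
        unless ($st == Z_OK or $st == Z_STREAM_END);
    return $out;
}

my $deflater = deflateInit();
my $inflater = inflateInit();

for my $message ("first message\n", "second message\n") {
    my $packet = pack_message($deflater, $message);
    print unpack_message($inflater, $packet);
}
```

    Each sync flush costs a few extra bytes, so flushing after every tiny message hurts the compression ratio. That is the conflict between throughput and latency described above.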

    Two articles that may help you understand the warning. See Suffering From Buffering and then It's the Latency, Stupid. If you read those and don't follow my point, just ask.

    UPDATE
    A random tip which is not quite worth a new post. If you are on Windows, you will need to look at binmode.
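    For what it's worth, a small sketch of what that looks like when compressed or encrypted bytes go to a file. write_bytes and read_bytes are hypothetical helper names; the only point is the binmode call on each handle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write raw bytes without newline translation.
sub write_bytes {
    my ($path, $bytes) = @_;
    open my $fh, '>', $path or die "cannot write $path: $!";
    binmode $fh;
    print {$fh} $bytes or die "write to $path failed: $!";
    close $fh or die "close of $path failed: $!";
    return;
}

# Read a whole file back as raw bytes.
sub read_bytes {
    my ($path) = @_;
    open my $fh, '<', $path or die "cannot read $path: $!";
    binmode $fh;
    local $/;                   # slurp mode
    my $bytes = <$fh>;
    close $fh;
    return $bytes;
}

my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
close $tmp_fh;

my $payload = "binary\r\n\x1a\x00data";
write_bytes($tmp_path, $payload);
print "round trip ok\n" if read_bytes($tmp_path) eq $payload;
```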

      Hey, good idea. But the reason why I wanted to do encryption first is to give the compression more data to work with to find similarities in. Granted, encryption garbles up the file... this presents an interesting problem.
      russ
        If you are using a half-way decent encryption algorithm, I guarantee you that the only way a compression algorithm can significantly compress your data later is if it breaks your encryption. By contrast if you compress your data first you will get compression and also make your encryption harder to break.

        But if you want, try coding it up both ways and see for yourself. I would offer a bet that my way works better, but I dislike stealing money on a sucker bet.
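        A rough way to see the effect without wiring up a cipher: compare how well zlib does on repetitive text and on random bytes. The random bytes are only a stand-in for good ciphertext (an assumption, no real encryption happens here), and compressed_ratio is a made-up helper name.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Compressed size divided by original size; smaller is better.
sub compressed_ratio {
    my ($data) = @_;
    return length(compress($data)) / length($data);
}

# Highly repetitive plaintext.
my $text = "the quick brown fox jumps over the lazy dog\n" x 200;

# Random bytes of the same length, standing in for ciphertext.
my $noise = join '', map { chr int rand 256 } 1 .. length $text;

printf "plain text:   %.3f\n", compressed_ratio($text);
printf "random bytes: %.3f\n", compressed_ratio($noise);
```

        A ratio near or above 1 means the compressor found nothing to work with, which is what you should expect from data that has already been encrypted.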

        the reason why i wanted to do encryption first is to give the compression more data to work with to find similarities with.

        Ain't gonna happen. What makes strong encryption strong is the lack of similarities between different parts of the data. Ideal encryption would produce output completely indistinguishable from random noise. Compression algorithms look for patterns in the data they're compressing. So do codebreakers.

        This brings up an interesting question: could one use a compression utility to test the strength of various ciphers? I'd guess that most compression algorithms would look for very naive (cryptographically speaking) patterns, so the "compression test" would only weed out particularly simple ciphers, but I don't have any data or citations to back it up.

        --
        :wq
Re: Streaming Compression
by miyagawa (Chaplain) on Dec 17, 2001 at 11:38 UTC
Re: Streaming Compression
by Juerd (Abbot) on Dec 17, 2001 at 11:39 UTC
    Try gz, here's why:
    2;0 juerd@ouranos:~/tmp$ echo -n 'Hello, ' | gzip -c > test.gz
    2;0 juerd@ouranos:~/tmp$ echo 'World!' | gzip -c >> test.gz
    2;0 juerd@ouranos:~/tmp$ zcat test.gz
    Hello, World!

    Update
    crazyinsomniac wanted me not to post anything *nix-centric or not about Perl.
    So here goes:
    Try $a_streamable_compression_method, here's why:
    use Some::Module;
    $scalar = compress("Hello, ");
    $scalar .= compress("World!\n");
    print uncompress($scalar); # Hello, World!\n
    On CPAN, there are several gzip-capable modules.
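    As one possible concrete version of the above, here is a sketch using IO::Compress::Gzip and IO::Uncompress::Gunzip (bundled with recent Perls). gzip_member and gunzip_all are hypothetical helper names. The MultiStream option is what makes gunzip read through several concatenated gzip members instead of stopping after the first one.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Compress one piece of data into a complete gzip member.
sub gzip_member {
    my ($data) = @_;
    my $member;
    gzip(\$data => \$member) or die "gzip failed: $GzipError";
    return $member;
}

# Decompress a buffer holding one or more concatenated gzip members.
sub gunzip_all {
    my ($blob) = @_;
    my $plain;
    gunzip(\$blob => \$plain, MultiStream => 1)
        or die "gunzip failed: $GunzipError";
    return $plain;
}

my $scalar = gzip_member("Hello, ");
$scalar .= gzip_member("World!\n");
print gunzip_all($scalar);
```

    Every member carries its own gzip header and trailer, so many small members compress worse than one long stream.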
    2;0 juerd@ouranos:~$ perl -e'undef christmas'
    Segmentation fault
    2;139 juerd@ouranos:~$