Re: Self-extracting compressed code!

by Anonymous Monk
on May 22, 2002 at 11:48 UTC (#168400)

in reply to Self-extracting compressed code!

This is a really cool post.


use Text::Quote;
Text::Quote->quote_prop('compress_at',1);
undef $/;
print "use Text::Quote;\nundef\$/;\nprint eval(<DATA>);\n__DATA__\n".Text::Quote->quote(<>);
Which is not meant to detract from your very cool code. This one cheats for sure. :-)

Re: Re: Self-extracting compressed code!
by demerphq (Chancellor) on May 22, 2002 at 12:11 UTC

    I wrote this.

    The reason it's posted as Anonymous Monk is so stupid I'm not even going to tell you what happened.

    But to follow up, I'd gladly make Text::Quote fall back to your module if Compress::Zlib wasn't handy.

    Yves / DeMerphq
    Writing a good benchmark isn't as easy as it might look.

      Cool. I hadn't seen Text::Quote, much less realized from its unassuming name that it did compression. I'm curious -- what problem or itch inspired you?

      As for using the self-extracting code as a fallback, the algorithms requiring Huffman coding (bzip and Huffman) are probably not a good choice, since they require a priority queue module, which you're less likely to have than Compress::Zlib. I've only done "compressed", stand-alone versions of the decompression routines. Indeed, I hadn't thought about it as a zlib alternative on the compression end of things.

      The idea is that while I may have a bunch of strange modules on my system, the system where my script is running may just have the bare bones. This way I can compress the script, then use it elsewhere without adding any new module dependencies (like those Stuffit self-extracting archives that were so popular before all compression software was free). So if you changed T::Q to use this, you'd probably want to give the user the option of using it even if they happen to have Compress::Zlib on the system where T::Q is running.


        what problem or itch inspired you?

        Good question. Basically I wanted to isolate and clean up the quoting code from Data::Dump in that there is code for converting binary blocks into Base64 streams. I figured that if there was already going to be code to do base64 then I might as well compress on the fly as well. This was all for Data::BFDump where I am usually more interested in structure than content, and having ways to represent and quote large chunks of text in an efficient way appealed to me. Although I suppose it could be argued that I got carried away.

        since they require a priority queue module

        Hmm. I don't know anything like as much about compression as you, but I spent a while playing around with the Huffman algorithm in Perl. When exactly does the priority queue (a heap, I assume) become critical? For instance, in the testing I was doing I was building Huffman trees for thousands of elements, and even then, sorting the elements once up front and using an algorithm I worked out gave really fast run times without a priority queue. Why is the queue so critical? My understanding was that in standard compression there usually aren't that many elements or symbols. With a Huffman algorithm, if the initial list of elements to be encoded is in order, the tree can be constructed with almost no "sort" passes at all. (In fact, I wouldn't be surprised if the algorithm I came up with outperforms an algorithm using a Perl-based heap.) If you're interested I'll post the algorithm and my Perl code.
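
        For readers who haven't seen a heap-free construction, here is a minimal sketch of the well-known two-queue technique. It is not necessarily the algorithm referred to above; the function name huffman_code_lengths and the sample frequencies are made up for illustration. The symbols are sorted once, and after that the lightest remaining node is always at the front of either the leaf queue or the queue of merged nodes, so no priority queue is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: compute Huffman code lengths with two FIFO queues
# instead of a heap. $freq is a hashref of symbol => positive count.
sub huffman_code_lengths {
    my ($freq) = @_;

    # One sort up front; ties are broken by symbol so the result is deterministic.
    my @leaves = map  { +{ weight => $freq->{$_}, symbol => $_ } }
                 sort { $freq->{$a} <=> $freq->{$b} or $a cmp $b } keys %$freq;
    my @merged;    # internal nodes, created in non-decreasing weight order

    return {} unless @leaves;
    return { $leaves[0]{symbol} => 1 } if @leaves == 1;

    # The lightest remaining node sits at the front of one of the two queues.
    # On a tie, prefer the leaf queue.
    my $take = sub {
        return shift @merged unless @leaves;
        return shift @leaves unless @merged;
        return $leaves[0]{weight} <= $merged[0]{weight}
            ? shift @leaves
            : shift @merged;
    };

    # Repeatedly join the two lightest nodes until one root remains.
    while (@leaves + @merged > 1) {
        my $x = $take->();
        my $y = $take->();
        push @merged, { weight => $x->{weight} + $y->{weight}, kids => [ $x, $y ] };
    }

    # Walk the tree iteratively; a leaf's depth is its code length.
    my %length;
    my @stack = ([ $merged[0], 0 ]);
    while (my $item = pop @stack) {
        my ($node, $depth) = @$item;
        if ($node->{kids}) {
            push @stack, map { [ $_, $depth + 1 ] } @{ $node->{kids} };
        }
        else {
            $length{ $node->{symbol} } = $depth;
        }
    }
    return \%length;
}

my $lengths = huffman_code_lengths({ a => 45, b => 13, c => 12, d => 16, e => 9, f => 5 });
print join(' ', map { "$_=$lengths->{$_}" } sort keys %$lengths), "\n";
# prints: a=1 b=3 c=3 d=3 e=4 f=4
```

        Each merged node is at least as heavy as the one created before it, which is why appending to the second queue keeps it in order and the only sorting happens once, at the start.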

        Also, I have a brain teaser for you (originally posted as "A Golf question maybe? Huffman with a twist" and never answered, though that could have been due to a bad description). There exists an algorithm that can take a Huffman tree and reorganize it so that the longest paths are all on one side of the tree and the shortest are on the other. (In other words, the longest path is for the least likely element and the shortest for the most likely, and all of the paths are sorted from left to right by their frequency.) This algorithm runs in linear time, O(N). Can you find it?


        Yves / DeMerphq
        Writing a good benchmark isn't as easy as it might look.
