http://qs1969.pair.com?node_id=59888

Following the Huffman encoder posted here yesterday, here comes the always-useful Huffman decoder.

$/=$\;($l,$_)=unpack'sB*',<>;s/..//;r($&);$r=join'|',sort{$0{$b}cmp$0{$a}}keys %0;$_=substr$_,0,$l;s/$r/$0{$&}/g;sub r{@_=split//,pop;for$i(0,1){if($_[$i]){s /.{8}//s;$0{$p.$i}=pack'b8',$&}else{local$p=$p.$i;s/..//;r($&)}}}print

It is much shorter (229 bytes on Un*x) and, as you will see, sparsely documented.

Huffman decoder SOLUTION
by BooK (Curate) on Feb 21, 2001 at 13:47 UTC

    This obfuscation is submitted on behalf of the Paris.pm obfuscated channel.

    By now, you already know what this does. It is a Huffman decompressor.

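    As always, $/ is first undefined (it is assigned $\, which is undef by default), so that <> slurps the whole input. Then we fetch the length in bits of the compressed data (a 16-bit integer, the s in the unpack template) and, as one string of 0's and 1's (the B*), the concatenation of the encoded tree and the compressed data (plus the padding). See hfm.pl for how to create compressed data.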
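
    For anyone who wants to see the unpack step in isolation, here is a minimal sketch. The input bytes are made up for the example (they are not real hfm.pl output), and the variable names are mine. In the one-liner the tree bits are stripped off before the length is applied; here the length is applied directly, just to show what the two template letters do.

```perl
use strict;
use warnings;

# Hypothetical input: a 16-bit length saying that 11 bits are meaningful,
# followed by two payload bytes (16 bits, the last 5 being padding).
my $raw = pack('s', 11) . pack('B*', '1011001110100000');

# 's'  reads a native signed 16-bit integer.
# 'B*' turns everything that is left into a string of '0'/'1' characters,
#      most significant bit of each byte first.
my ($length, $bits) = unpack 's B*', $raw;

print "length = $length\n";                           # length = 11
print "bits   = $bits\n";                             # bits   = 1011001110100000
print "kept   = ", substr($bits, 0, $length), "\n";   # kept   = 10110011101
```

    The substr call plays the role of $_=substr$_,0,$l in the original: it drops the padding bits at the end.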

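    The r subroutine takes a two-character string (one that matches m/^[01]{2}$/), splits it and follows the Huffman tree structure: each character is a flag for one of the two children of the current node. A 1 means the child is a leaf (i.e. a character), so the next 8 bits are taken as that character; a 0 means the child is another node, so the subroutine is called recursively on the next two bits. This goes on until the whole tree has been traversed.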
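
    Spelled out with names and strict, the tree walk might look roughly like the sketch below. read_node, %char_for and the three-letter sample tree are my own inventions for illustration; the bit layout is only what can be read off the one-liner, so treat it as an assumption about the encoder's format.

```perl
use strict;
use warnings;

my %char_for;   # code string ("0", "10", ...) => decoded character

# Reads one internal node from the front of the bit string.
# A node starts with two flag bits, one per child: 1 means the child is
# a leaf (its 8 character bits follow), 0 means the child is another node.
sub read_node {
    my ($bits_ref, $prefix) = @_;
    my @is_leaf = split //, substr($$bits_ref, 0, 2, '');
    for my $branch (0, 1) {
        if ($is_leaf[$branch]) {
            # Leaf: consume 8 bits and store the character under its code.
            my $byte = substr($$bits_ref, 0, 8, '');
            $char_for{ $prefix . $branch } = pack 'b8', $byte;
        }
        else {
            # Internal node: its own flag bits come next in the stream.
            read_node($bits_ref, $prefix . $branch);
        }
    }
}

# Sample tree: a => 0, b => 10, c => 11.
my $tree = '10' . unpack('b8', 'a')                          # root: leaf, node
         . '11' . unpack('b8', 'b') . unpack('b8', 'c');     # node: leaf, leaf

read_node(\$tree, '');
print "$_ => $char_for{$_}\n" for sort keys %char_for;
```

    Passing the prefix as an argument does the job that local$p=$p.$i does in the original: each level of recursion sees the path from the root to itself.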

    $r is a regular expression that will be used to replace any string of 0's and 1's by the corresponding character in the Huffman tree. Please note that we sort the strings in the regular expression so that mismatch is impossible.
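
    A small sketch of the substitution step, using a hand-written code table instead of one read from a file. Note that this sketch sorts the codes longest first, which is my choice and not what the one-liner does (it sorts on the hash values); since no Huffman code is a prefix of another, at most one alternative can match at any position, so the order should not affect the result.

```perl
use strict;
use warnings;

# Hypothetical code table, as the tree walk might have produced it.
my %char_for = ('0' => 'a', '10' => 'b', '11' => 'c');

# One alternation containing every code. The keys are only 0's and 1's,
# so no quoting of regex metacharacters is needed.
my $pattern = join '|',
              sort { length($b) <=> length($a) } keys %char_for;

my $bits = '0101100';     # 0 10 11 0 0
(my $text = $bits) =~ s/($pattern)/$char_for{$1}/g;

print "$text\n";          # abcaa
```

    Because s///g carries on after the end of each match, characters that have already been decoded are never scanned again, even if they happen to be a 0 or a 1.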

    After the substitution is done, $_ is printed.

(tye)Re: Huffman decoder
by tye (Sage) on Feb 22, 2001 at 00:47 UTC

    binmode might be required for portability.
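
    To illustrate, here is a minimal round trip through a temporary file with binmode on both handles. The data is made up and deliberately contains the bytes 0x0D 0x0A, which are the ones that newline translation on some platforms would alter.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Made-up binary data: a 16-bit length, then the bytes 0x0D 0x0A.
my $data = pack 's B*', 16, '0000110100001010';

# tempfile() returns an open handle and the file name.
my ($out, $file) = tempfile(UNLINK => 1);
binmode $out;                         # write the bytes exactly as given
print {$out} $data;
close $out or die "Cannot close $file: $!";

open my $in, '<', $file or die "Cannot read $file: $!";
binmode $in;                          # read the bytes exactly as stored
my $read_back = do { local $/; <$in> };
close $in;

print $read_back eq $data ? "round trip ok\n" : "data changed\n";
```

    For the decoder itself, assuming the compressed data arrives on STDIN, the equivalent would presumably be a binmode STDIN before the read and a binmode STDOUT before the print.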

            - tye (but my friends call me "Tye")
      Not might Tye. Will. Damn binmode crap....

      Yves
      --
      You are not ready to use symrefs unless you already know why they are bad. -- tadmc (CLPM)