Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
Inspired by BooK's quite excellent obfuscated Huffman Encoder (434 chars on a w32 box), and many hours of my time playing with the algorithm I decided I would make an attempt at my own, and try to shave a characters off of the solution. After quite a bit of effort (this obfus stuff is hard.. ++ to those of you who make it look so easy) getting bitten by binmode, discovering which global variables could be used in exotic ways I wish to present my solutions: (All of the solutions expect input and ouput over STDIN and STDOUT)

My first solution is the equivelent to BooK's original encoder, ie it only encodes the message, without a header containing the information necessary to decode the message. It is 316 characters long (on a w32 box).

map{binmode$_}\*STDIN,\*STDOUT;sub r{my($n,$s,$c)=@_;$c=$$n[0];return$ +1{$c}=$s if!ref($c);map r($$c[$_],$s.$_),0,1;}sub h{$/=\1;$0{$_}++,$,.=$_ while +<>;@_=map [$_,$0{$_}],keys%0;while(@_>1){@_=sort{$$b[1]<=>$$a[1]}@_;$l=pop;$r=po +p;push@_ ,[[$l,$r],$$l[1]+$$r[1]];}r pop;$,=~s/./$1{$&}/gs;pack"SB*",$.,$,;}pri +nt h;
My next solution is the same thing but this time with the header information. I call it, it is 337 bytes long.
$"=$_;map{binmode$_}\*STDIN,\*STDOUT;sub r{my($n,$s,$t,$c)=@_;$c=$$n[0 +];return $,++,$_{$c}=$s,push@0,pack"aCB16",$c,$t,$s if!ref($c);map r($$c[$_],$s +.$_,$t+1 ),0,1}sub h{$/=\1;$0{$_}++,$;.=$_ while<>;@_=map[$_,$0{$_}],keys%0;whi +le(@_>1) {@_=sort{$$b[1]<=>$$a[1]}@_;$l=pop;$r=pop;push@_,[[$l,$r],$$l[1]+$$r[1 +]]}r pop ;$;=~s/./$_{$&}/gs;chr($,)."@0".pack("SB*",$.,$;)}print h
The next is the corresponding decoder, as might be guessed I call it, and it is 259 bytes long. The output of the encoder can be piped into this script and the result will be the original file. such as by
perl < | perl
(usage on *nix machines might be little more graceful..)
$/=$_;map{binmode$_}\*STDIN,\*STDOUT;while($"=<>){map{$/=substr$",0,4, +'';($,, $;)=unpack"aC",$/;$_{(unpack"aCB$;",$/)[2]}=$,}1..ord substr$",0,1,''; +($,,$") =unpack"SB*",$";map{$;=1;$;++ until exists$_{substr$",0,$;};$\.=$_{sub +str$",0 ,$;,""}}1..$,;print""}
And for my last trick, I humbly present the combination of the two. This version I call and is 622 bytes long. Like the others it uses STDIN and STDOUT but also takes an argument to determine if it decodes or encodes (no arguments at all).
perl < | perl -d
$"=$/=$_;map{binmode$_}\*STDIN,\*STDOUT;sub r{my($n,$s,$t,$c)=@_;$c=$$ +n[0]; return$,++,$_{$c}=$s,push@0,pack"aCB16",$c,$t,$s if!ref($c);map r($$c[ +$_], $s.$_,$t+1),0,1}sub h{$/=\1;$0{$_}++,$;.=$_ while<>;@_=map[$_,$0{$_}], +keys %0;while(@_>1){@_=sort{$$b[1]<=>$$a[1]}@_;$l=pop;$r=pop;push@_,[[$l,$r +], $$l[1]+$$r[1]]}r pop;$;=~s/./$_{$&}/gs;chr($,)."@0".pack("SB*",$.,$;)} +if( pop@ARGV){while($"=<>){map{$/=substr$",0,4,'';($,,$;)=unpack"aC",$/;$_ +{( unpack"aCB$;",$/)[2]}=$,}1..ord substr$",0,1,'';($,,$")=unpack"SB*",$" +;map{ $;=1;$;++ until exists$_{substr$",0,$;};$\.=$_{substr$",0,$;,""}}1..$, +;print ""}}else{print h}
Many thanks to the Obfu crew and especially BooK for the inspiration and the walk through of his implementation, especially in regard to the use of funky global variables.

BTW, the code is set up for 80 columns, which means if you dont want it split you need to see your wrap code length to be 81 (go figure) in the User Settings to see all of them properly.

Oh and apologies to any who think this is a waste of time as its already been done, for me it was a _great_ exercise (I think that there are imprtant lessons to be learned by doing an obfu), and I'm damn proud of it. (never even looked at BooK's Huffman Decoder!)


You are not ready to use symrefs unless you already know why they are bad. -- tadmc (CLPM)
UPDATE:Minor text changes

In reply to New Huffman Obfus by demerphq
in thread Huffman encoder by BooK

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (5)
As of 2022-08-18 00:38 GMT
Find Nodes?
    Voting Booth?

    No recent polls found