I have written a similar program for JavaScript years ago, and now this is for Perl!
Disclaimer: I'd still call myself a beginner Perl programmer, which means I am not familiar with every trick in the book! But I tried to write a perl program that reads another perl script and breaks it down into basic units such as symbols, strings, numbers, variables, barewords, regex, etc. It changes all the variable names to shortest possible names and then rejoins the code without unnecessary spaces. It knows to update the variable names inside double-quoted strings and backticks and inside regexes also. It removes all comments, pods, and anything that comes after __END__. It removes spaces and line breaks except when they occur inside strings or heredoc. The code becomes about 75% smaller.
What I like to see is that even after it has squeezed the spaces out, I can run the obfuscator on the already obfuscated script, and it doesn't break the code. What I don't understand yet is that it's not supposed to throw away essential parts of the code, yet for some reason if I run the program on itself, it becomes smaller and smaller. If I feed the obfuscated code to the obfuscator again, it becomes a few bytes smaller each time. And the script still runs without errors. I don't understand what's going on. (Edit: I think, I know what's going on. Get rid of the Shuffle() function, and the program will generate the shortest code possible everytime.)
I haven't done a lot of testing, but the program looks okayish. I have run it on two fairly complex scripts, and they still worked after being obfuscated, so that's good news. If you find any bugs, I would appreciate if you could share the problematic script that resulted in the errors, so I know what causes the glitch. Known problems: I know for a fact that I did not write the program to remove comments when they appear inside a multiline regex, so if you have a script with comments inside a multiline regex, I don't know how the obfuscator will handle that!
EDIT: 1/11/2024: Completely eliminated the Shuffle() function as it made the output code larger.
Made some minor changes to the CreateFile() ReadFile() functions suggested by Randal Schwartz. Thanks you!
1/12/2024: Eliminated some unnecessary tests in detecting variables which made the code bloated such as testing for $A[$B] when this can be done in two steps. Also made a decision not to replace variables such as $^A in the code, because it could cause warnings to appear when running the obfuscated code.
|
Before you run this script, you must change two variables -- $INPUT_FILE and $OUTPUT_FILE which are defined near the beginning of the file.
|
This is what this program does when it is told to obfuscate itself:
#!/usr/bin/perl -w
use 5.004;$|=1;my$c=$0;my$d="Z:\\Test.pl";my$e=1;my$f=0;print"\n\nZsol
+t's Perl Compressor & Obfuscator v1 FREEWARE";my$g=-s $c;print"\nRead
+ing file: $c ($g bytes)";my$h=ReadTheEntireFile($c); $h=~tr|\r\n\t\x
+20-\x7E||cd;my$i=$g-length($h);print"\nBinary characters removed: $i"
+;$h=~s/\r\n/\n/g; $h=~tr|\r|\n|;my$j=$h=~tr|\n|\n|;print"\nNumber of
+lines: ",($j+1);my$k=' a b _ 0 1 2 3 4 5 6 7 8 9 ENV ARG ARGV ARGVOUT
+ PID GID EGID UID EUID SUBSEP F INC ISA OSNAME SIG BASETIME MATCH PRE
+MATCH POSTMATCH OFS NR RS ORS WARNING ERRNO PERLDB LIST_SEPARATOR PRO
+CESS_ID PROGRAM_NAME REAL_GROUP_ID EFFECTIVE_GROUP_ID REAL_USER_ID EF
+FECTIVE_USER_ID SUBSCRIPT_SEPARATOR OLD_PERL_VERSION SYSTEM_FD_MAX IN
+PLACE_EDIT PERL_VERSION EXECUTABLE_NAME LAST_PAREN_MATCH LAST_SUBMATC
+H_RESULT LAST_MATCH_END LAST_PAREN_MATCH LAST_MATCH_START LAST_REGEXP
+_CODE_RESULT OUTPUT_FIELD_SEPARATOR INPUT_LINE_NUMBER INPUT_RECORD_SE
+PARATOR OUTPUT_RECORD_SEPARATOR OUTPUT_AUTOFLUSH ACCUMULATOR FORMAT_F
+ORMFEED FORMAT_PAGE_NUMBER FORMAT_LINES_LEFT FORMAT_LINE_BREAK_CHARAC
+TERS FORMAT_LINES_PER_PAGE FORMAT_TOP_NAME FORMAT_NAME EXTENDED_OS_ER
+ROR EXCEPTIONS_BEING_CAUGHT OS_ERROR EVAL_ERROR COMPILING DEBUGGING V
+ERSION ';my@l;my$m=2000;if($e){@l=GenerateShortVariableNames($m);}pri
+nt"\nProcessing code...";my@n; my$o='';my$p=0;my$q; my$r='';my$s='';m
+y$t='';my$u=0;my$v=0;my%w; my$x=0;for(;$x< length($h);$x++){$q=$o;$o=
+substr($h,$x,1);$p=vec($h,$x,8);if($p==32||$p==9||$p==10){$v=1;next;}
+if($p==123){$t='BLOCK';push(@n,'{');$u++;next;}if($p==125){$t='BLOCK'
+;push(@n,'}');if($u>0){$u--;}else{print"\nWarning: Missing opening br
+ace.\n";}if($f&& $u==0){push(@n,"\n");}next;}if($p==60){if($s eq'='||
+($t ne'VAR'&&$t ne'NUM')){my$y=substr($h,$x,32);if($y=~m/\<\<\s*['"]{
+,1}|\<\<[a-zA-Z\_]+/){CaptureHeredoc() and next;}}}if($o eq'='&&$q eq
+"\n"){CapturePOD() and next;}if($p==47&&($t eq'WORD'&&index('])}',$s)
+<0)||($o eq'~'&&index('=!',$q)>=0)){CaptureRegex() and next;}if($p==3
+6||$p==64||$p==37){if(index('*+-[{(;,=',$s)<0){if($v&&($t eq'VAR'&&$p
+==36)||($t eq'WORD'&&index(' my our defined undef return '," $r ")<0)
+){push(@n,' ');}}CaptureVariable() and next;}if($o eq'#'&&$q ne'$'){C
+aptureComment();next;}my$z=index('[]()<>;,!?=$@%&:+*^|\\\/~-.',$o);if
+($z>=0){if($v&& $o eq'.'&&$t eq'NUM'){push(@n,' ');}push(@n,$o);$s=$o
+;$v=0;next;}if($p>47&&$p<58){if(index('<>,:;|[(%/*=+-',$s)<0){if($v&&
+((index('.',$s)>=0)||($t eq'WORD'||$t eq'VAR'))){push(@n,' ');}}Captu
+reNumber();next;}if($p==34||$p==39||$p==96){if(index('={}[]();,.',$s)
+<0){if($v&& $s eq'$'||$t eq'STR'||$t eq'VAR'){push(@n,' ');}}CaptureS
+tring();next;}if($p==95||($p>96&&$p<123)||($p>64&&$p<91)){if(index('*
++-[{(;,=',$s)<0){if($v&&($t eq'WORD'||$t eq'VAR'||$t eq'SYM')){push(@
+n,' ');}}$v=1;$s='x';my$A=CaptureWord();if($p==113&&index(' qq qw qx
+qr q '," $A ")>=0){$r=$A;CaptureQQ();next;}if($r eq'sub'){print'.';}i
+f($A eq'__END__'){pop(@n);last;}if($A eq'__DATA__'){$x++;CaptureTail(
+);last;}$r=$A;next;}}undef%w; undef$h;undef@l; print"\nSaving file: $
+d ";CreateFile($d,@n);$g=-s $d;print"($g bytes)\n\n";exit; sub Captu
+reRegex{my$B=$x;my$C='';my$D='';my$y=substr($h,$x,256);my$E=0;if($y=~
+m/^(\~\s*tr\s*)([^\t\n ]{1})/){$D='~tr';$E=length($1);$C=$2 x 2;}elsi
+f($y=~m/^(\~\s*y\s*)([^\t\n ]{1})/){$D='~y';$E=length($1);$C=$2 x 2;}
+elsif($y=~m/^(\~\s*s\s*)([^\t\n ]{1})/){$D='~s';$E=length($1);$C=$2 x
+ 2;}elsif($y=~m/^(\~\s*m\s*)([^\t\n ]{1})/){$D='~m';$E=length($1);$C=
+$2;}elsif($y=~m/^(\~?)(\s*)\//){$D=$1;$E=length($D. $2);$C='/';}else{
+$x=$B;return 0;}length($D) and push(@n,$D);$x+=$E; $C=GetSeparatorCha
+racters($C);CapturePattern($C);$t='REGEX';$r='';$s='x';$v=0;return 1;
+}sub CaptureQQ{my$G=substr($h,++$x,1);my$C=GetSeparatorCharacters($G)
+;CapturePattern($C);$t='REGEX';$r='';$s='x';$v=0;return 1;}sub Captur
+ePattern{my$C=shift;my$G=substr($C,0,1);my$H=substr($C,1,1);my$I=(len
+gth($C)==2)?1:2;my$J=$G eq $H;if($J){$I++;$C=$G;}my$K=0;my$L=0;my$B=$
+x;my$M='';for(;$x< length($h);$x++){my$o=substr($h,$x,1);$M.=$o; if((
+$o eq'\\'&&($K& 1)==0)||$o eq'$'||$o eq'@'){my$y=substr($h,$x,500);if
+($y=~m/^([\$\@]{1}[a-zA-Z\_]{1}[a-zA-Z\_0-9]*)/){my$N=$1;$x+=length($
+N)-1;$M.=ReplaceVariable($N);}elsif($y=~m/^(Q\s*)([\$\@]{1}[a-zA-Z0-9
+\_]+)(\s*\\E)/){my$O=$1;my$N=$2;my$P=$3;$x+=length($O)+length($N)+len
+gth($P)-1;$M.=ReplaceVariable($N)."\\E";}}if(($K& 1)==0){if($J){if($o
+ eq $C){$I--;}}else{if($o eq $G){$L++;}if($o eq $H){$L--;if($L==0){$I
+--;}}}if($I==0){last;}}if($o eq'\\'){$K++;}else{$K=0;}}if($x+1==lengt
+h($h)){print"\nError: Unexpected end of file in the middle of regex o
+r quoted list.\n";return 0;}push(@n,$M);return 1;}sub GetSeparatorCha
+racters{my$C=defined$_[0]?$_[0]:'';length($C) or return'';my$G=substr
+($C,0,1);my$H=$G;if($G eq'('){$H=')';}if($G eq'['){$H=']';}if($G eq'<
+'){$H='>';}if($G eq'{'){$H='}';}if($G eq $H){return(length($C)==1)?$G
+ x 2:$G x 3;}my$Q=$G. $H;return(length($C)==1)?$Q: $Q x 2;}sub Captur
+ePOD{my$y=substr($h,$x,64);if($y=~m/^(\=[a-zA-Z]+)/){$x+=length($1);$
+x=index($h,"\n=cut",$x);if($x>0){$x+=4;}else{$x=length($h);}$s='x';$r
+='';$v=1;return 1;}else{return 0;}}sub CaptureNumber{my$y=substr($h,$
+x,512);my$R='';if($y=~m/([0-9]+)/){$R=$1;}elsif($y=~m/(0x[0-9a-fA-F]+
+)/){$R=$1;}elsif($y=~m/(0b[01]+)/){$R=$1;}elsif($y=~m/([0-9\_]*[0-9]{
+3})/){$R=$1;}elsif($y=~m/([0-9]+\.[0-9]+)/){$R=$1;}elsif($y=~m/([0-9]
++[0-9.]*[eE]{1}[\-\+]{,1}[0-9]+)/){$R=$1;}elsif($y=~m/([0-9.]+)/){$R=
+$1;}if(length($R)){$x+=length($R)-1;push(@n,$R);}else{print"\nWarning
+: Unrecognized number.\n";return 0;}$t='NUM';$s='x';$v=0;}sub Capture
+Word{my$y=substr($h,$x,512);my$A='';if($y=~m/^([a-zA-Z0-9\_]+)/){$A=$
+1;if($f&& $A eq'sub'){if(@n&& $n[$#n] ne"\n"){push(@n,"\n");}}push(@n
+,$A);$x+=length($A)-1;}$t='WORD';$s='x';$v=0;return$A;}sub CaptureHer
+edoc{my$S='';my$O='';my$B=$x; my$y=substr($h,$x,512);if($y=~m/^\<\<([
+a-zA-Z\_]{1}[a-zA-Z\_0-9]*)/){$S=$1;$x+=length($S);push(@n,'<<'.$S.';
+');}elsif($y=~m/^\<\<\s*(['"]{1})/){my$T=$1; my$U=index($y,$T); for($
+x=$U;$x< length($h);$x++){my$o=substr($h,$x,1);if($o eq $T&& substr($
+h,$x-1,1)ne"\\"){$S=substr($h,$U,$x-$U-1);last;}}push(@n,'<<'.$T. $S.
+ $T.';');}else{return 0;}my$V=index($h,"\n",$x);if($V<0){$x=$B; print
+"\nError: Unexpected end of file in the middle of heredoc.\n";return
+0;}$x=$V+1;my$W=index($h,"\n$S\n",$x);if($W<0){print"\nError: Heredoc
+ never ends.\n";return 1;}my$X=substr($h,$x,$W-$x);push(@n,$X."\n".$S
+."\n");$t='HEREDOC';$s='x';$v=0;return 1;}sub CaptureVariable{my$y=su
+bstr($h,$x,512);my$Y='';if($y=~m/^([\$\@\%\#]{1,2}[a-zA-Z\_]{1}[a-zA-
+Z0-9\_]*)/){$Y=$1;}elsif($y=~m/^([\$\@\%]{1}[0-9<>()!?~=:;,.`'%@&#\-\
++"\/\]\[\|\\]{1})/){$Y=$1;}elsif($y=~m/^([\$\@\%]{1,2}[a-zA-Z]*[\{\[]
+{1}[a-zA-Z0-9\_]+[\}\]]{1})/){$Y=$1;}elsif($y=~m/^([\$\@\%]{1,2}\x5E[
+a-zA-Z0-9\_]+\{[a-zA-Z0-9\_]+\})/){$Y=$1;}elsif($y=~m/^([\$\x5E]{2}[a
+-zA-Z\_]{1})/){$Y=$1;}else{return 0;}$x+=length($Y)-1;push(@n,Replace
+Variable($Y));$t='VAR';$s='x';$v=0;return 1;}sub ReplaceVariable{my$Z
+=shift; $e or return$Z;my$aa='';my$O='';if($Z=~m/^([\$\%\@\#]{1,2})([
+a-zA-Z0-9\_]+)/){$O=$1;$aa=$2;}if(length($aa)==0||length($O)==0){retu
+rn$Z;}if(exists($w{$aa})){return$O. $w{$aa};}if(index($k," $aa ")>=0)
+{return$Z;}my$ab=scalar keys %w; if(@l< $ab){print"\nWarning: Ran out
+ of variable names! Please increase the value of \$MAXVARS\n";return$
+Z;}my$ac=$l[$ab]; $w{$aa}=$ac; return$O. $ac;}sub ReplaceVarNameInsid
+eString{my$y=substr($h,$x,512);if($y=~m/^([\$\@\%\#]{1,2})\{([a-zA-Z0
+-9\_]+)\}/){my$O=$1;my$aa=$2;my$N=$O. $aa;$x+=length($N)+1;$aa=substr
+(ReplaceVariable($N),1);$_[0] .=$O.'{'.$aa.'}';return 1;}if($y=~m/^([
+\$\@\%]{1}\#?[a-zA-Z\_]{1}[a-zA-Z0-9\_]*)/){my$N=$1;$x+=length($N)-1;
+$N=ReplaceVariable($N);$_[0] .=$N;return 1;}if($y=~m/^(\$\x5E[a-zA-Z\
+_]{1})/){my$N=$1;$x+=length($N)-1;$_[0] .=$N;return 1;}if($y=~m/^(\x2
+4[0-9\<\>\(\)!?~=:;,.^`'%\@&#\+\_\[\]\|\/\\]{1})/){my$N=$1;$x++;$_[0]
+ .=$N;return 1;}return 0;}sub CaptureTail{pop(@n);push(@n,"\n__DATA__
+");push(@n,substr($h,$x));return 1;}sub CaptureComment{my$W=index($h,
+"\n",$x); if($W<0){$x=length($h);return 1;}my$ad=substr($h,$x,$W-$x);
+$x=$W; if(@n==0&&$ad=~m/^\#\!\//){push(@n,$ad."\n");}$r='';$s='x';$v=
+1;return 1;}sub CaptureString{my$T=substr($h,$x,1);my$ae=$T;my$K=0;fo
+r($x++;$x< length($h);$x++){my$o=substr($h,$x,1);if($o eq"\n"){$ae.=(
+$K& 1)?'n': '\\n';$K=0;next;}if(($T eq'"'||$T eq'`')&&($K& 1)==0&&ind
+ex('$@',$o)>=0){ReplaceVarNameInsideString($ae) and next;}$ae.=$o; if
+($o eq $T){if(($K& 1)==0){push(@n,$ae);$t='STR';$s='x';$v=0;return 1;
+}}if($o eq'\\'){$K++;}else{$K=0;}}push(@n,$ae);print"\nWarning: Unter
+minated string constant.\n";return 1;}sub GenerateShortVariableNames{
+my$ab=shift;my@af=('a'..'z','A'..'Z','_',0..9);my$ag=join('',@af);my$
+ah=length($ag);my$ai=$ah-10;my$aa;@af=();for(my$aj=0;$aj< $ai;$aj++){
+$aa=substr($ag,$aj,1);if(index($k," $aa ")>=0){next;}push(@af,$aa);if
+(@af>=$ab){return@af;}}for(my$aj=0;$aj< $ai;$aj++){for(my$ak=0;$ak< $
+ah;$ak++){$aa=substr($ag,$aj,1) .substr($ag,$ak,1);if(index($k," $aa
+")>=0){next;}push(@af,$aa);if(@af>=$ab){return@af;}}}for(my$aj=0;$aj<
+ $ai;$aj++){for(my$ak=0;$ak< $ah;$aj++){for(my$al=0;$al< $ah;$ak++){$
+aa=substr($ag,$aj,1) .substr($ag,$ak,1) .substr($ag,$al,1);if(index($
+k," $aa ")>=0){next;}push(@af,$aa);if(@af>=$ab){return@af;}}}}return@
+af;}sub ReadTheEntireFile{my$F=defined$_[0]?$_[0]:'';$F=~tr`<>*?\"\|\
+x00-\x1F``d; length($F) or return'';-f $F or return'';my$am=-s $F; $a
+m or return'';my$an='';local*FILE;sysopen(FILE,$F,0)or return'';my$ao
+=sysread(FILE,$an,$am); close FILE;return$an;}sub CreateFile{defined$
+_[0]or return 0;my$F=shift; $F=~tr`<>*?\"\|\x00-\x1F``d;length($F) or
+ return 0;local*FILE;open(FILE,">$F")or return 0;binmode FILE;foreach
+(@_){defined$_ and length($_) and print FILE $_;}close FILE;-f $F or
+return 0;return 1;}
|
use Date::Manip::Date;
$d = Date::Manip::Date->new;
for (36,18,-17,7,-45) {
$d->parse(111111 . $_);
print substr $d->printf("%j%a%p%h" x 4), -$_, 1;
}
print "\n";
- Intermediate question
- The date format has the four letters j a p h. Of the four letters in the output, which one do we extract from the output of which format letter?
- Advanced question
- This code will stop working about 45 days after it's posted. How can you fix the code so it works even after that date? I'll show a solution inside the spoiler, in case someone arrives here too late and wants to try this obfu.
- Expert question
- Among the five loop iterations, the string is parsed in five different ways. How many of these are deliberate features of Date::Manip?
Update 2023-02-07T22:53Z: improved the obfuscation by changing the list of numbers from 36,18,2,7,-45 to 36,18,-17,7,-45. Now the obfuscation outputs JaPh instead of Japh and the five iterations parse the date in five different formats rather than just four. There's one drawback:
|