http://qs1969.pair.com?node_id=11109834


in reply to Perl script compressor

I am not even sure what kind of complicated code patterns may exist in a perl script.

Classic example of "only perl can parse Perl" (based heavily on tye's example here):

BEGIN { eval( rand>0.5 ? 'sub foo () {}' : 'sub foo {}' ) }
foo / # /;
42;

That's perfectly valid Perl code even under strict. Run that through B::Deparse a few times, and sometimes you'll get

foo(/ # /); '???';

and sometimes

&foo() / 42;

In other words, sometimes it's a function call to which the result of a regex is being passed followed by a constant (that gets optimized away), and sometimes it's the return value of a function call, followed by a division operator, a comment, and the divisor. Which it is depends entirely on a random number that changes from run to run, and the result can't be known until the perl binary executes the first line of Perl code; doing the same with a static parse (without executing code) is impossible.
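To make the two readings reproducible, here is a minimal sketch that swaps the random choice for two explicit declarations and compiles the same two statements after each one. The helper value_with, the Demo package names, and the return value 84 are my own illustrative choices, not part of tye's example.

```perl
use strict;
use warnings;

# The two statements from the example above. The newline matters: in the
# division reading, "# /;" is a comment that runs to the end of the line.
my $snippet = "foo / # /;\n42;\n";

# Hypothetical helper: compile $snippet in a fresh package right after the
# given declaration of foo, and return the value of the last statement.
my $counter = 0;
sub value_with {
    my ($declaration) = @_;
    my $package = 'Demo' . ++$counter;
    local $_ = '';    # the regex reading matches against $_
    my $value = eval "package $package; $declaration\n$snippet";
    die $@ if $@;
    return $value;
}

my $as_division = value_with('sub foo () { 84 }');  # parsed as foo() / 42
my $as_regex    = value_with('sub foo    { 84 }');  # parsed as foo(/ # /); 42;

print "with prototype ():  $as_division\n";   # 2
print "without prototype:  $as_regex\n";      # 42
```

The source text is byte-for-byte identical in both calls; only the declaration that perl has already seen differs, and that alone decides whether the slash starts a regex or is a division.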

I need to come up with a logic that can interpret a line of complex perl code and remove comments and spaces in such a way that it won't break the code.

See "Perl Cannot Be Parsed: A Formal Proof".

The closest thing to a static Perl parser is PPI (and perhaps PPR; Update 2: and new: Guacamole), and marto already pointed you to one module based on it; see also Perl::Squish. Update: You could also look at Perl::Tidy. Update 3: I think RPerl also implements its own parser.
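As a rough sketch of what the PPI route can look like, assuming PPI is installed from CPAN (it is not a core module): the sample source below is made up for illustration, and PPI's result is only its best static guess, so ambiguous code like the foo example above can still be misread.

```perl
use strict;
use warnings;
use PPI;    # CPAN module, not core

# Made-up sample input: one '#' inside a string, two real comments.
my $source = <<'END_SOURCE';
# leading comment
my $s = "not # a comment";   # trailing comment
print "$s\n";
END_SOURCE

my $document = PPI::Document->new(\$source)
    or die "PPI could not parse the source";

# prune() removes every element of the given class from the document tree.
$document->prune('PPI::Token::Comment');

my $stripped = $document->serialize;
print $stripped;
```

The '#' inside the double-quoted string survives because PPI tokenizes it as part of the string, which is exactly where a line-by-line regex approach tends to go wrong.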