
PPI::Normal has the intention, in the long run, of normalizing Perl in such a way that functionally equivalent code will present the same DOM tree. Originally PPI::Normal was close to that, but further development of PPI has scaled it back somewhat.

As for your question about B::Deparse, currently I'm using it like this:

my $deparse = B::Deparse->new(
    "-p",   # add extra parentheses
    "-q",   # expand double-quoted strings
    "-sC",  # cuddle else/elsif/continue blocks
    "-x3",  # expand syntax constructs
);

Unfortunately, that doesn't handle the case of variables having different names, even if the code is functionally equivalent. Your LISP solution sounds interesting, but I wonder how I would present the got/expected failure information? Not too many people are going to want to look at a LISP equivalent.
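
To make the limitation concrete, here is a minimal sketch of comparing two code references by their deparsed text. The helper name same_code and the sample subs are hypothetical, invented for illustration; only B::Deparse->new and coderef2text come from B::Deparse itself.

```perl
use strict;
use warnings;
use B::Deparse;

# Same options as above.
my $deparse = B::Deparse->new( "-p", "-q", "-sC", "-x3" );

# Hypothetical helper: true if both subs deparse to identical text.
sub same_code {
    my ( $got, $expected ) = @_;
    return $deparse->coderef2text($got) eq $deparse->coderef2text($expected);
}

# Same logic, different layout and parentheses.
my $one = sub { my $x = shift; return $x + 1 };
my $two = sub {
    my $x = shift;
    return( $x + 1 );
};

# Same logic again, but a different variable name.
my $three = sub { my $y = shift; return $y + 1 };

print "one vs two:   ", ( same_code( $one, $two )   ? "same" : "different" ), "\n";
print "one vs three: ", ( same_code( $one, $three ) ? "same" : "different" ), "\n";
```

Differences in layout disappear in the deparsed text, but the renamed variable in the third sub still makes the comparison fail.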

Cheers,
Ovid


Re^3: Test::Code
by thor (Priest) on Aug 12, 2005 at 11:28 UTC
    PPI::Normal has the intention, in the long run, of normalizing Perl in such a way that functionally equivalent code will present the same DOM tree.
    Okay...maybe I'm missing something, but isn't that nigh impossible? Consider these two very simple functions:
    sub alpha {
        foreach (1..10) {
            print;
        }
    }

    sub beta {
        for (my $i = 10; $i >= 1; $i--) {
            print 11 - $i;
        }
    }
    Both do the exact same thing, albeit in different ways. I'd be interested to see some sort of automated solution that reduces both to the same thing at any level.

    thor

    Feel the white light, the light within
    Be your own disciple, fan the sparks of will
    For all of us waiting, your kingdom will come

Re^3: Test::Code
by diotalevi (Canon) on Aug 12, 2005 at 13:07 UTC

    I have a solution for you but I haven't finished it yet. I wrote Data::Postponed so I could abstract off the symbol renaming part of B::Deobfuscate into something else. The interesting effect of that is you'd end up with a tree of perl syntax with placeholders wherever a symbol name went. Normally you'd just let the values interpolate in but if you wished, you could change the values or just dump out the intermediate structure.

    Consider this. It's how I remember things being dumped during simple debugging. Obviously a more convenient debug output could be provided.

    (. (. (. (. "sub " SUBNAME) " {\n    print ") FOO ) ";\n}")
Re^3: Test::Code
by xdg (Monsignor) on Aug 12, 2005 at 12:27 UTC

    For the variable names, what about trying something similar to B::Deobfuscate? Take the deparsed code and walk through it, renaming each variable in turn consistently but abstractly. E.g. if the first variable encountered is "$foo", replace all "([$@%&*])foo" with "${1}var1" everywhere. In other words, wherever "foo" is used as a symbol to refer to a variable, replace it with something predictable. So even if the other piece of code uses "bar" instead of "foo", as long as the symbol exists in the same semantic place in the deparsed code, it will get replaced similarly by "var1".

    My mind boggles at the regex challenge of doing this sanely on Perl code, so the rest of the solution is left as an exercise for the reader.
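
    A deliberately naive sketch of the idea, for what it's worth. The function name normalize_vars is made up for illustration, and the regex only handles plain $, @ and % identifiers that start with a letter. It will happily mangle a modulus operator written without a space, or an "@" inside a single-quoted string, so treat it as a starting point and not as a solution.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: rename variables to var1, var2, ... in order of
    # first appearance. The same name always maps to the same alias, so
    # $foo, @foo and %foo all become var1 with their sigils kept.
    sub normalize_vars {
        my ($code) = @_;
        my ( %alias, $count );
        $code =~ s{([\$\@%])([A-Za-z]\w*)}{
            my ( $sigil, $name ) = ( $1, $2 );
            $alias{$name} //= 'var' . ++$count;
            $sigil . $alias{$name};
        }ge;
        return $code;
    }

    my $got      = normalize_vars('my $foo = 1; print $foo + @list;');
    my $expected = normalize_vars('my $bar = 1; print $bar + @items;');

    print $got eq $expected ? "same\n" : "different\n";
    ```

    Special variables such as $_ and @_ are left alone because their names do not start with a letter.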

    The other thing that occurs to me is not bothering with B::Deparse but going straight back to B and comparing the op trees directly. Simon Cozens has some examples of walking the op tree in Advanced Perl Programming (2nd ed). E.g. (almost straight from the text):

    use B;

    my $subref = sub {
        # some subroutine
    };

    my $b  = B::svref_2object( $subref );
    my $op = $b->START;
    do {
        print B::class($op) . " : " . $op->name . " (" . $op->desc . ")\n";
    } while $op = $op->next and not $op->isa("B::NULL");

    B:: is way over my head, but the notion of walking the two trees and comparing operations directly seems like it might make it easier than worrying about interpreting the deparsed version of the same thing. (Let perl parse Perl.)
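
    A rough sketch of what that comparison might look like, assuming it is enough to compare the names of the ops in execution order. The helpers op_names and same_ops are hypothetical. The walk only follows ->next, so it ignores the branches of conditionals, and because it looks at op names only it treats "$x + 1" and "$x + 2" as identical.

    ```perl
    use strict;
    use warnings;
    use B;

    # Hypothetical helper: list the op names of a sub in execution order.
    sub op_names {
        my ($code) = @_;
        my @names;
        my $op = B::svref_2object($code)->START;
        while ($$op) {    # a B::NULL object wraps a null pointer, so $$op is 0
            push @names, $op->name;
            $op = $op->next;
        }
        return @names;
    }

    # Hypothetical helper: true if both subs yield the same op sequence.
    sub same_ops {
        my ( $got, $expected ) = @_;
        return join( ' ', op_names($got) ) eq join( ' ', op_names($expected) );
    }

    my $with_foo = sub { my $foo = shift; return $foo + 1 };
    my $with_bar = sub { my $bar = shift; return $bar + 1 };
    my $other    = sub { my $foo = shift; return $foo * 2 };

    print "foo vs bar:   ", ( same_ops( $with_foo, $with_bar ) ? "same" : "different" ), "\n";
    print "foo vs other: ", ( same_ops( $with_foo, $other )    ? "same" : "different" ), "\n";
    ```

    Variable names never show up in the op names, so the renaming problem goes away, at the cost of having nothing readable to show in the got/expected output.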

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

      That's easy. Deobfuscate both snippets of code and feed each the same dictionary. It's a deterministic process, so you'll get the same result from equivalent code regardless of the symbol names.