in reply to RFC: Location via B::Deparse

Since there are no comments yet, here's an update on how far I have gotten.

Consider this code using B::Concise to disassemble a function and B::Deparse to reconstruct the code into Perl:

use B::Deparse;
use B::Concise qw(set_style);
sub foo() {
    my $x=1; $x+=1;
    $x *=2;
    my $z = 0;
}
my $walker = B::Concise::compile('-basic', 'foo', \&foo);
B::Concise::set_style_standard('debug');
B::Concise::walk_output(\my $buf);
$walker->(); # walks and renders into $buf;
print($buf);
my $deparse = B::Deparse->new("-p", "-sC");
foo();
$body = $deparse->coderef2text(\&foo);
print $body, "\n";

When I step through function foo in the Trepan debugger, the addresses match the addresses in the disassembly output. However, when I step through a deparse, I get different addresses for the instructions. So matching instructions seen inside the debugger with those seen in deparse seems hopeless. Here is an abbreviated log:

Command: trepan.pl /tmp/foo.pl
-- main::(/tmp/foo.pl:8 @0x161d800)
my $walker = B::Concise::compile('-basic', 'foo', \&foo);
(trepanpl): continue 15
main::foo:
UNOP (0x3a89838)
	op_next		0
	op_sibling	0
	op_ppaddr	PL_ppaddrOP_LEAVESUB
	op_type		175
	op_flags	4
	op_private	65
	op_first	0x161daa0
LISTOP (0x161daa0)
	op_next		0x3a89838
	op_sibling	0
	op_ppaddr	PL_ppaddrOP_LINESEQ
	op_type		181
	op_flags	12
	op_private	0
	op_first	0x161dae8
	op_last		0x3a898e0
COP (0x161dae8)
	op_next		0x161db90
	op_sibling	0x161db48
	op_ppaddr	PL_ppaddrOP_DBSTATE
	op_type		183
	op_flags	1
	op_private	0	0
BINOP (0x161db48)
...
x1 main::(/tmp/foo.pl:15 @0x5b2a818)
$body = $deparse->coderef2text(\&foo);
(trepanpl): s
-- B::Deparse::(/usr/share/perl/5.18/B/Deparse.pm:820 @0x3c8d8d8)
    my $self = shift;
(trepanpl): b deparse_sub
Breakpoint 2 set in /usr/share/perl/5.18.2/B/Deparse.pm at line 985
(trepanpl): continue
-> B::Deparse::(/usr/share/perl/5.18/B/Deparse.pm:985 @0x3c92b18)
sub deparse_sub {
    my $self = shift;
(trepanpl): next 3
-- B::Deparse::(/usr/share/perl/5.18/B/Deparse.pm:988 @0x3cb0f68)
    my $proto = "";
(trepanpl): $cv
$DB::D[0] = B::CV=SCALAR(0x5c11da8)
(trepanpl): $cv->ROOT
$DB::D1 = B::UNOP=SCALAR(0x5d30648)
(trepanpl): $cv->ROOT->first
$DB::D2 = B::LISTOP=SCALAR(0x5d3f9c8)
(trepanpl): $cv->ROOT->first->name
$DB::D3 = lineseq
(trepanpl):

Note that addresses such as 0x5c11da8 just don't match the disassembly. However, when I run printf "%x\n", \&foo, that does give me an address that matches values in the disassembly.

Re^2: RFC: Location via B::Deparse
by rockyb (Scribe) on Oct 28, 2015 at 03:34 UTC

    Note: this has been heavily edited.

    With help from Stack Overflow, I have a way to get started.

    use B::Deparse;
    use B::Concise qw(set_style);

    sub foo() {
        my $x=1; $x+=1;
        my $z=0;
    }

    my $deparse = B::Deparse->new("-p", "-l", "-sC");
    $body = $deparse->coderef2text(\&foo);
    print($body, "\n");
    my $walker = B::Concise::compile('-basic', 'foo', \&foo);
    B::Concise::set_style_standard('debug');
    B::Concise::walk_output(\my $buf);
    $walker->(); # walks and renders into $buf;
    print($buf);

    package B::Deparse;

    sub lineseq {
        my($self, $root, $cx, @ops) = @_;
        my($expr, @exprs);

        my $out_cop = $self->{'curcop'};
        my $out_seq = defined($out_cop) ? $out_cop->cop_seq : undef;
        my $limit_seq;
        if (defined $root) {
            $limit_seq = $out_seq;
            my $nseq;
            $nseq = $self->find_scope_st($root->sibling) if ${$root->sibling};
            $limit_seq = $nseq if !defined($limit_seq)
                               or defined($nseq) && $nseq < $limit_seq;
        }
        $limit_seq = $self->{'limit_seq'}
            if defined($self->{'limit_seq'})
            && (!defined($limit_seq) || $self->{'limit_seq'} < $limit_seq);
        local $self->{'limit_seq'} = $limit_seq;

        my $fn = sub {
            my ($text, $i) = @_;
            my $op = $ops[$i];
            push @exprs, sprintf("# op: 0x%x\n%s ", $$op, $text);
        };
        $self->walk_lineseq($root, \@ops, $fn);

        # $self->walk_lineseq($root, \@ops,
        #                     sub { push @exprs, $_[0]} );

        my $sep = $cx ? '; ' : ";\n";
        my $body = join($sep, grep {length} @exprs);
        my $subs = "";
        if (defined $root && defined $limit_seq && !$self->{'in_format'}) {
            $subs = join "\n", $self->seq_subs($limit_seq);
        }
        return join($sep, grep {length} $body, $subs);
    }
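
The part of the lineseq override that matters is `$$op`. A B object is a blessed reference to a scalar that holds the address of the underlying op, so the `SCALAR(0x...)` printed for the object is the address of the Perl-level wrapper, not of the op itself. Here is a minimal sketch of the difference, using only the core B module; the sub `foo` and the variable names are illustrative and not taken from Devel::Trepan:

```perl
use strict;
use warnings;
use B qw(svref_2object);

sub foo { my $x = 1; $x += 1; return $x; }

my $cv   = svref_2object(\&foo);   # B::CV wrapper object for foo
my $root = $cv->ROOT;              # wrapper for the root (leavesub) op

# Stringifying the wrapper shows the address of the wrapper scalar.
my $wrapper = "$root";

# Dereferencing the wrapper yields the address of the op structure.
my $op_addr = sprintf("0x%x", $$root);

# A freshly created wrapper for the same op holds the same op address.
my $op_addr_again = sprintf("0x%x", ${ $cv->ROOT });

print "wrapper: $wrapper\n";
print "op addr: $op_addr\n";
print "same op: ", ($op_addr eq $op_addr_again ? "yes" : "no"), "\n";
```

The dereferenced value is what B::Concise prints as the op address in its debug style, which is why it can be compared against the disassembly.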

      Devel::Trepan 0.70 has just been released. In it, the deparse command will use the current location to deparse. This does handle multiple statements on a single line, inside a subroutine. (I haven't been able to figure out how to get this to work in the main program.)

      Just to get this done, I needed to monkeypatch B::Deparse and add some other routines that probably should be in another package somewhere.

      To make it possible to figure out the exact location within a call site, e.g.

      fib($x-1) + fib($x-2)

      addresses are now shown in the call stack and are recorded in the stack in the 0.70 release. I beefed up Devel::Trepan::Disassemble and made a release of that too. The basic idea here is that you disassemble the surrounding context to reconstruct the fragment of code. (The call location is marked with an arrow automatically if it is found.)
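
To illustrate why an address helps here: both calls in the fib example sit on the same source line, so a line number cannot tell them apart, but each call has its own entersub op with its own address. The following is a minimal sketch using only the core B module. The helper `walk_ops` is a hypothetical name introduced for illustration, and this is not how Devel::Trepan itself is implemented:

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);

sub fib {
    my $x = shift;
    return $x < 2 ? $x : fib($x-1) + fib($x-2);
}

# Visit every op in the tree under $op, parents before children.
sub walk_ops {
    my ($op, $visit) = @_;
    $visit->($op);
    return unless $op->flags & OPf_KIDS;
    # The sibling chain ends with a null op, whose address ($$kid) is 0.
    for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
        walk_ops($kid, $visit);
    }
}

# Collect the address of every subroutine-call op inside fib.
my @call_addrs;
walk_ops(svref_2object(\&fib)->ROOT, sub {
    my $op = shift;
    push @call_addrs, $$op if $op->name eq 'entersub';
});

printf "entersub at 0x%x\n", $_ for @call_addrs;
```

Given an address taken from a call-stack entry, comparing it against the collected entersub addresses would identify which of the two calls is active.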

      Better, though, would be more changes to B::Deparse so that you can pass it an opcode location and have it deparse around that. Also, I'd like to see more COP instructions added, say for loops of the 3-part kind: initialize, test, and increment.

      As with other science and engineering, the more you do, the more you realize there is to do. I may beef up B::CodeLines so that it notes which lines have multiple statements in them.

      Finally, I observe that providing really good debugging support, not just in Perl but other languages as well, is often really hard and strays way outside of the "debugger" proper. I think that's one reason you don't find things like this often in debuggers.

        An update on where things stand now...

        GitHub now has modified code to save deparsed text fragments and more of an abstract runtime tree structure. I've moved the "deparse" command out of Devel::Trepan, since it now uses this cooler code, B::DeparseTree, which is not part of the core Perl distribution and right now only works for versions 5.20 and 5.22.

        At some point I'll package these for CPAN and make a new release of Devel::Trepan. There are still lots of bugs in B::DeparseTree and improvements that could be made: B::Deparse, which it is based on, has thousands of tests, and I tried merely a thousand or so of these. And even with these tests, B::Deparse still has bugs, based on trying to run it against the core Perl tests.

        This was an incredible time sink. But I think it is really cool. Things like Carp::confess and callstack routines could be beefed up to give more information about the source at a failure.