rutgeraldo has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I am trying to construct self-referential objects (tree/DAG nodes) using Inline::C. These are the requirements:

I am hampered by the following challenges:

Nonetheless, playing around with the Inline::C cookbook and the results of various Google searches is getting me started with just enough knowledge to be dangerous. Below is the simplest case I came up with just for child->parent relationships:

#!/usr/bin/perl
package Node;
use strict;
use warnings;
use Inline C => <<'__EOI__';

// XXX no children for now
typedef struct Node {
    struct Node* parent;
    SV* sv;
} Node;

Node* new(const char * classname) {
    Node *self;

    // allocate and initialize struct
    Newx(self, 1, Node);
    self->parent = NULL;

    // create perl object and ref
    SV* perlref = newSViv((IV)self);
    SV* obj_ref = newRV_noinc(perlref);
    sv_bless(obj_ref, gv_stashpv(classname, TRUE));
    SvREADONLY_on(perlref);

    // store pointer to object in struct
    self->sv = obj_ref;
    return self;
}

// should NOT alter refcount
Node* set_parent(Node* self, Node* parent) {
    self->parent = parent;
    return self;
}

Node* get_parent(Node* self) {
    return self->parent;
}

// XXX get/set children deferred

void destroy_node(Node* self) {
    Safefree(self);
}
__EOI__

sub DESTROY {
    my $self = shift;
    warn "destroying $self";
    $self->destroy_node;
}

package main;
use Devel::Peek;
$|++;

my $node = Node->new;
my $parent = Node->new;

print "NODE: ";
Dump($node);
print "---\n";

print "PARENT: ";
Dump($parent);
print "---\n";

$node->set_parent($parent);

print "FROM FIELD: ";
print Dump($node->get_parent);
print "---\n";

Here is my typemap:

TYPEMAP
Node *    NODE

INPUT
NODE
    $var = ($type)SvIV(SvRV($arg));

OUTPUT
NODE
    $arg = $var->sv;

When I run this, the following output is produced:

NODE: SV = IV(0x7fcc7b8041b8) at 0x7fcc7b8041c8
  REFCNT = 1
  FLAGS = (PADMY,ROK)
  RV = 0x7fcc79803e98
  SV = PVMG(0x7fcc790023b0) at 0x7fcc79803e98
    REFCNT = 1
    FLAGS = (OBJECT,IOK,READONLY,pIOK)
    IV = 140516178002128
    NV = 0
    PV = 0
    STASH = 0x7fcc7a004e70 "Node"
---
PARENT: SV = IV(0x7fcc7a008838) at 0x7fcc7a008848
  REFCNT = 1
  FLAGS = (PADMY,ROK)
  RV = 0x7fcc79803ce8
  SV = PVMG(0x7fcc79002770) at 0x7fcc79803ce8
    REFCNT = 1
    FLAGS = (OBJECT,IOK,READONLY,pIOK)
    IV = 140516178002208
    NV = 0
    PV = 0
    STASH = 0x7fcc7a004e70 "Node"
---
FROM FIELD: SV = UNKNOWN(0xff) (0x7fcc7a004630) at 0x7fcc79803fd0
  REFCNT = 0
  FLAGS = (TEMP)
Attempt to free unreferenced scalar: SV 0x7fcc79803fd0, Perl interpreter: 0x7fcc79800000 at ref.pl line 72.
---
destroying Node=SCALAR(0x7ff391803ce8) at ref.pl line 51.
destroying Node=SCALAR(0x7ff391803e98) at ref.pl line 51.

My conclusions are as follows:

  1. Object instantiation works more or less as I hoped. The typemap conversion gives me back the blessed "obj_ref" from the constructor, which in turn has the reference count from "perlref", which is created from casting the pointer to Node* as an IV integer value. (Correct?).
  2. Method invocation works as hoped. The getters and setters are found, the invocant is converted with the macros SvIV(SvRV()) to get the reference, then its integer value, which we then cast as Node* pointer. (Correct?).
  3. The additional method argument to set_parent is also converted correctly, using the same typemapping as the invocant.
  4. However, getting our parent back results in perl not knowing what to do with the parent Node*. Some sort of unknown, moribund thing comes back. This thing throws up the "Attempt to free unreferenced scalar..." warning. This part I really don't understand and I could use some help/explanation.
  5. Otherwise, the destructor for nodes is called as intended. It doesn't give any warnings, but whether that means I've done enough to clean up after myself, I don't know. Any advice would be greatly appreciated. I imagine that once children come into the mix I would increment their refcount in the setter and decrement it in the destructor, which might trigger their destruction recursively.

Apologies for the long "first" post. I used to be user "rvosa" but forgot my password and lost my email address. Thanks!

Re: Inline::C self-referential struct idioms and memory
by BrowserUk (Patriarch) on Sep 04, 2015 at 14:18 UTC

    Here's how I do that kind of thing.

    The source code (Node.pl):

    #!/usr/bin/perl
    package Node;
    use strict;
    use warnings;
    use Inline C => Config => BUILD_NOISY => 1;
    use Inline C => <<'__EOI__', NAME => 'Node', CLEAN_AFTER_BUILD =>0, TYPEMAPS => 'Node.typemap';

    #define CLASS "Node"

    // XXX no children for now
    typedef struct Node {
        struct Node* parent;
        SV* sv;
    } Node;

    Node *new( const char *classname ) {
        Node *t;
        Newx( t, 1, Node );
        t->sv = newSViv( 0 );
        return t;
    }

    // should NOT alter refcount
    Node* set_parent( Node* self, Node* parent ) {
        self->parent = parent;
        return self;
    }

    Node* get_parent(Node* self) {
        return self->parent;
    }

    // XXX get/set children deferred

    void destroy_node(Node* self) {
        Safefree(self);
    }
    __EOI__

    sub DESTROY {
        my $self = shift;
        warn "destroying $self";
        $self->destroy_node;
    }

    package main;
    use Devel::Peek;
    $|++;

    my $node = Node->new();
    my $parent = Node->new();

    print "NODE: ";
    Dump($node);
    print "---\n";

    print "PARENT: ";
    Dump($parent);
    print "---\n";

    $node->set_parent($parent);

    print "FROM FIELD: ";
    print Dump($node->get_parent);
    print "---\n";

    The typemap (Node.typemap):

    TYPEMAP
    const char *    T_PV
    Node *          O_OBJECT
    U64             T_UV
    U8              T_UV
    U8 *            T_PV

    INPUT
    O_OBJECT
        if( sv_isobject($arg) && ( SvTYPE( SvRV($arg) ) == SVt_PVMG ) )
            $var = INT2PTR( $type, SvIV( (SV*)SvRV( $arg ) ) );
        else{
            warn( \"${Package}::$func_name() -- $var is not a blessed SV reference\" );
            XSRETURN_UNDEF;
        }

    OUTPUT
    # The Perl object is blessed into 'CLASS', which should be a
    # char* having the name of the package for the blessing.
    O_OBJECT
        sv_setref_pv( $arg, (char *)CLASS, (void*)$var );

    The output from a run:

    C:\test>Node.pl
    NODE: SV = RV(0x35f17f0) at 0x35f17e0
      REFCNT = 1
      FLAGS = (PADMY,ROK)
      RV = 0x19f070
      SV = PVMG(0x35e6308) at 0x19f070
        REFCNT = 1
        FLAGS = (OBJECT,IOK,pIOK)
        IV = 58154984
        NV = 0
        PV = 0
        STASH = 0x1f6f48 "Node"
    ---
    PARENT: SV = RV(0x35f1718) at 0x35f1708
      REFCNT = 1
      FLAGS = (PADMY,ROK)
      RV = 0x3787a00
      SV = PVMG(0x35e6368) at 0x3787a00
        REFCNT = 1
        FLAGS = (OBJECT,IOK,pIOK)
        IV = 58163768
        NV = 0
        PV = 0
        STASH = 0x1f6f48 "Node"
    ---
    destroying Node=SCALAR(0x19ed58) at C:\test\Node.pl line 41.
    FROM FIELD: SV = RV(0x19f098) at 0x19f088
      REFCNT = 1
      FLAGS = (TEMP,ROK)
      RV = 0x19ed58
      SV = PVMG(0x35e6278) at 0x19ed58
        REFCNT = 1
        FLAGS = (OBJECT,IOK,pIOK)
        IV = 58163768
        NV = 0
        PV = 0
        STASH = 0x1f6f48 "Node"
    destroying Node=SCALAR(0x19ed58) at C:\test\Node.pl line 41.
    ---
    destroying Node=SCALAR(0x3787a00) at C:\test\Node.pl line 41.

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
    I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!

      Great, thanks! I really like what that constructor has turned into. Lessons learned:

      • You can 'use Inline C => ...' multiple times. I've always wondered about how to pass all this different stuff into it. Like this, apparently.
      • The '#define CLASS "Node"' is picked up in the typemap (right?). So this should also work for definitions in separate source and header files, presumably.
      • For the output in the typemap I guess we do have to make a new SV* every time, and bless it? So we can't just carry the "original" one around in our struct? I wonder how expensive doing this every time is. I'm sure it's not dramatic but I think I'll benchmark this. I assume my not doing sv_setref_pv was behind those "bizarre copy" warnings, correct?
      • For the input in the typemap, can I just skip the checks and jump straight to INT2PTR? I assume disasters will happen if users pass in the wrong argument - but they just shouldn't do that.

      Anyway, thanks - great answer! What do you advise in terms of managing child nodes in a container? These trees are not binary, so I need a container member in the Node struct that can grow and shrink. Is it madness to try to come up with something by myself, and should I just use an AV*? If I set children, based on my logic of recursive cleanup from root to tips, am I correct to think I can SvREFCNT_inc in the setter and decrement all the children in the destructor, then clear out the container?

        • The '#define CLASS "Node"' is picked up in the typemap (right?). So this should also work for definitions in separate source and header files, presumably.

          Yes. In theory, you could add the O_OBJECT input and output mappings into the default typemap file; and then all you'd need in the typemap for any given XS class is the name typedef:

          Node * O_OBJECT

          I did try it once but couldn't quite get it to work and I've stuck with this typemap ever since.

        • For the output in the typemap I guess we do have to make a new SV* every time, and bless it?

          Remember what is being created is just another reference (RV) not the object itself.

          I cannot say with 100% certainty that it is the only way to do it; just that it works.

          I suspect it is necessary because what you pass back will get assigned to a Perl variable; which will go out of scope and get cleaned up.

          You might try to avoid that by managing the reference count of the reference to the object; but it's not the way perl usually does things.

        • I assume my not doing sv_setref_pv was behind those "bizarre copy" warnings, correct?

          Probably, but once again, I cannot say for sure. Like most XS programmers; a large part of the code I use is essentially cargo-culted.

          You find something that works and stick with it because the alternative is an endless cycle of trial and error.

        • For the input in the typemap, can I just skip the checks and jump straight to INT2PTR?

          Probably. But I think that conditional check costs so little; and the information it gives when someone does pass the wrong thing is so useful; I would only do so if I was really desperate for performance.

          And then only if removing it actually made a significant difference to that performance; which I doubt.

        • Is it madness to try to come up with something by myself and should I just use an AV*?

          The answer to that really depends on how dynamic the children are; but AVs are pretty cheap and very flexible. I'd only avoid them if I could substitute a fixed-size C array and know it would cater for all circumstances.

        • If I set children, based on my logic of recursive cleanup from root to tips, am I correct to think I can SvREFCNT_inc in the setter and decrement all the children in the destructor, then clear out the container?

          I can't answer that. I'd have to try it out and see what works.
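
For comparison with the AV* option, here is a minimal sketch in plain C (no Perl API) of a hand-rolled growable child container with the counting scheme asked about above: the setter takes a count on each child, the destructor drops those counts, and the parent pointer stays uncounted. All names (TreeNode, node_new, node_add_child, node_release, live_nodes) are hypothetical and not from the thread. Inside Inline::C one would presumably use Newx/Renew/Safefree instead of calloc/realloc/free, and SvREFCNT on the wrapping SV instead of the plain int used here; that mapping is an assumption, not something tested against perl.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical plain-C model of an n-ary tree node; not the thread's Node. */
typedef struct TreeNode {
    struct TreeNode *parent;     /* weak back-pointer: never counted */
    struct TreeNode **children;  /* growable array of counted children */
    size_t n_children;
    size_t capacity;
    int refcnt;                  /* stands in for a Perl refcount */
} TreeNode;

int live_nodes = 0;              /* bookkeeping for the demonstration only */

TreeNode *node_new(void) {
    TreeNode *n = calloc(1, sizeof *n);
    if (!n) return NULL;
    n->refcnt = 1;               /* the creator holds the first count */
    live_nodes++;
    return n;
}

/* Append a child, doubling the array when it is full.
   The container takes one count on the child. Returns 0 on success. */
int node_add_child(TreeNode *self, TreeNode *child) {
    if (self->n_children == self->capacity) {
        size_t cap = self->capacity ? self->capacity * 2 : 4;
        TreeNode **grown = realloc(self->children, cap * sizeof *grown);
        if (!grown) return -1;
        self->children = grown;
        self->capacity = cap;
    }
    self->children[self->n_children++] = child;
    child->parent = self;        /* weak: no count taken */
    child->refcnt++;
    return 0;
}

/* Drop one count; at zero, release the children (root to tips) and free. */
void node_release(TreeNode *n) {
    if (--n->refcnt > 0) return;
    for (size_t i = 0; i < n->n_children; i++) {
        n->children[i]->parent = NULL;  /* parent is going away */
        node_release(n->children[i]);
    }
    free(n->children);
    free(n);
    live_nodes--;
}
```

With a root, a child and a grandchild linked this way, releasing the creator's counts on the two descendants leaves all three nodes alive; releasing the root then frees the whole chain. Because the parent pointer is not counted, the parent/child cycle cannot keep nodes alive.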


Re: Inline::C self-referential struct idioms and memory
by BrowserUk (Patriarch) on Sep 04, 2015 at 13:00 UTC

    How are you informing Inline that you wish to use your typemap when it builds the code?

    When I use them I do it with this: TYPEMAPS => 'GetContext.typemap' on the use Inline C line:

    use Inline C => <<'END_C', NAME => 'Trie', CLEAN_AFTER_BUILD =>0, TYPEMAPS => '/test/Trie.typemap';

      I have a file named 'typemap' in the folder where I issue the command to run the example script. Inline::C figures it out automagically.
Re: Inline::C self-referential struct idioms and memory
by syphilis (Archbishop) on Sep 08, 2015 at 02:38 UTC
    // should NOT alter refcount
    Node* set_parent(Node* self, Node* parent) {
        self->parent = parent;
        return self;
    }

    There are memory management issues here. Why does this function return "self"?
    IOW, why is it not written as:
    void set_parent(Node* self, Node* parent) { self->parent = parent; }
    When it returns "self", it apparently decrements the reference count of "self". Here's a supporting demo:
    #!perl -l
    use warnings;
    no warnings 'once';
    use Inline C => Config => BUILD_NOISY => 1;
    use Inline C => <<'EOC';

    SV * foo(SV * in) {
        return in;
    }

    SV * get_ref(SV * in) {
        return newSVuv(SvREFCNT(in));
    }

    EOC

    $x = 42;           # $x refcnt == 1
    %h = (fu => \$x);  # $x refcnt == 2
    print "x refcnt: ", get_ref($x);
    foo($x);           # decrement $x refcnt
    print "x refcnt: ", get_ref($x);
    $y = foo($x);      # decrement $x refcnt
    print "x refcnt: ", get_ref($x);
    print "y refcnt: ", get_ref($y);
    print "x: $x";
    print "y: $y";
    Which outputs:
    x refcnt: 2
    x refcnt: 1
    x refcnt: 0
    y refcnt: 1
    Use of uninitialized value $x in concatenation (.) or string at try.pl line 31.
    x:
    y: 42
    I note that the return of set_parent() is not being captured, which reinforces my view that it could be rewritten as I've suggested.
    But if it really does need to return "self" && the refcnt is not to be altered (as the comment specifies), then I think it's necessary to increment the refcnt of "self" before returning.

    Either way worked fine for me.
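
To make the counting in the demo easier to follow, here is a minimal plain-C model (no Perl API) of what appears to be happening. It assumes, as the demo output suggests, that the glue code marks a returned SV* as mortal, meaning one decrement is applied once the calling statement finishes. Handle, mortalize, free_temps and the two return_* helpers are hypothetical names for illustration, not perl functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV: only its reference count is modelled. */
typedef struct {
    int refcnt;
} Handle;

/* Deferred decrements, applied at the end of the calling statement. */
#define MAX_PENDING 8
static Handle *pending[MAX_PENDING];
static size_t n_pending = 0;

/* Schedule one decrement, as the glue is assumed to do for a returned SV*. */
static Handle *mortalize(Handle *h) {
    assert(n_pending < MAX_PENDING);
    pending[n_pending++] = h;
    return h;
}

/* End of statement: apply every scheduled decrement. */
static void free_temps(void) {
    for (size_t i = 0; i < n_pending; i++)
        pending[i]->refcnt--;
    n_pending = 0;
}

/* Like foo(): hands back a handle it does not own a spare count on. */
static Handle *return_borrowed(Handle *stored) {
    return mortalize(stored);
}

/* The suggested fix: take an extra count before returning. */
static Handle *return_retained(Handle *stored) {
    stored->refcnt++;
    return mortalize(stored);
}

int refcnt_after_borrowed_return(void) {
    Handle h = {2};          /* like $x, also referenced from %h */
    return_borrowed(&h);
    free_temps();
    return h.refcnt;         /* one count has been lost */
}

int refcnt_after_retained_return(void) {
    Handle h = {2};
    return_retained(&h);
    free_temps();
    return h.refcnt;         /* count unchanged */
}
```

In this model the borrowed return leaves the count at 1 and the retained return leaves it at 2, which mirrors the first step of the demo output and the increment-before-return suggestion.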
    However, for the same reason, I had to rewrite get_parent() as:
    Node* get_parent(Node* self) {
        SvREFCNT_inc(self->parent);
        return self->parent;
    }
    After that, it all seems to work fine for me (using your script and your typemap):
    NODE: SV = IV(0x2c22740) at 0x2c22744
      REFCNT = 1
      FLAGS = (PADMY,ROK)
      RV = 0x45812c
      SV = PVMG(0x2a15cac) at 0x45812c
        REFCNT = 1
        FLAGS = (OBJECT,IOK,READONLY,pIOK)
        IV = 40118476
        NV = 0
        PV = 0
        STASH = 0x1d0edb4 "Node"
    ---
    PARENT: SV = IV(0x2c227f0) at 0x2c227f4
      REFCNT = 1
      FLAGS = (PADMY,ROK)
      RV = 0x45807c
      SV = PVMG(0x2a155ac) at 0x45807c
        REFCNT = 1
        FLAGS = (OBJECT,IOK,READONLY,pIOK)
        IV = 40119340
        NV = 0
        PV = 0
        STASH = 0x1d0edb4 "Node"
    ---
    FROM FIELD: SV = NULL(0x1d0e7) at 0x45817d
      REFCNT = -16777216
      FLAGS = (TEMP,BREAK,OVERLOAD,EVALED,IsUV)
    ---
    destroying Node=SCALAR(0x45807c) at rut.pl line 58.
    destroying Node=SCALAR(0x45812c) at rut.pl line 58.
    One other thing - I changed destroy_node() to:
    void destroy_node(Node* self) {
        Safefree(self->parent);
        Safefree(self);
    }
    That didn't blow anything up, from which I deduce that self->parent does need to be Safefree'd in order to avoid memory leaks.
    OTOH, attempting to Safefree(self->sv) did crash the script - and I therefore deduce that doing so is not the right thing ;-)

    Update: Whilst the above holds for my perls 5.10.0, 5.12.0, 5.14.0, 5.16.0, 5.18.0 and 5.20.0, on perl-5.22.0 the Safefree(self->parent) leads to a crash for me (on cleanup after the 2 calls to destroy_node have successfully completed) - so there's still a bit more to be unravelled.

    Cheers,
    Rob