in reply to Re: XS: Manipulating refcounts
in thread XS: Manipulating refcounts

Whatever your level with experience with C/XS I think you've tracked down the critical bug. :) More on that in a bit.

You're absolutely right that this data structure should use SVs and not C structs. The very idea of using C structs when an ordinary hash would do is ridiculous! But then it wouldn't illustrate the problem I'm trying to solve.

This technique is being developed for use with Apache Lucy, which will be a large C library with both Perl and Ruby bindings. The "object" member of the struct will actually be a void*, and it could be a Perl object, or a Ruby object, or eventually, something else. (Actually, I don't know exactly whether or how it will work with Ruby -- hopefully it will. :) ) All the Perl-specific C routines need to stay in the XS bindings and cannot be used in the C core. Dereferencing an actual SV and getting at its contents is not an option. However, we need some way of working with native garbage collection, and this is it.

With regards to whether or not storing a pointer is dangerous... it's common practice, even in core modules. You can make DB_File segfault by decrementing the pointer in the tied object! So don't do that. :)

#!/usr/bin/perl use strict; use warnings; use DB_File; my %hash; my $tied = tie %hash, 'DB_File', "garbagefile"; $hash{foo} = "bar"; print STDERR $tied->FETCH('foo') . "\n"; $$tied--; print STDERR $tied->FETCH('foo') . "\n"; __END__ Outputs: slothbear:~/perltest marvin$ perl ptrobj_segfault.plx bar Bus error slothbear:~/perltest marvin$
--
Marvin Humphrey
Rectangular Research ― http://www.rectangular.com

Replies are listed 'Best First'.
Re^3: XS: Manipulating refcounts
by Animator (Hermit) on Sep 20, 2006 at 18:21 UTC

    Ofcourse there are (CORE) modules doing it... But it's not that because they do it you should do it too... And in my opinion the tied hash from DB_File and your scalar are two completly differnt things...

    The way to make DB_File segfault is to get the tied object and then change it.

    The way to make your code segfault is to either change the thing it references to or pass the wrong thing to one of your other routines.

    For example:

    my $artist2; my $cd = new_cd($artist2);
    This code will segfault. Sure in this example it is (or atleast should be) obvious why it segfaults... But if you are passing $artist to several subroutines and maybe forgetting it/passing the wrong thing once then it is not that obvious...

    To prevent $$artist from being changed you could use: SvREADONLY_on(SvRV(perlref));.

    But this still leaves the other issue... As in, passing the wrong variable/undef to new_cd(); (for example)...

    I guess it comes down to:

    • Who will be using it?
    • How familiar are they with segfaults?
    • Do they expect to see a segfault in perl?
    • Will they think about your code as causing the segfault? (I guess that comes down to: how well is it documented)
    • Do you really care about the memory that an extra list would use?

    Update: removed accidental link.

      The standard way of throwing a meaningful error rather than segfaulting when supplied with bad input of the type you describe is to vet arguments using sv_derived_from, which is the perlapi function which implements isa. Manually inserted, such argument checking would look something like this...
      /* Constructor for CD. */ SV* new_cd(SV *artist_sv) { if (!sv_derived_from(artist_sv, "Artist") { Croak("Not an Artist"); } Artist *artist = INT2PTR( Artist*, SvIV(SvRV(artist_sv)) ); /* ... */ }

      That stops anything but Artist and its subclasses from getting through. If somebody does something like bless a hash into the "Artist" package and submit that we'll still get a segfault, but that's less likely to happen inadvertently.

      Typically, the argument-checking code is not typed in manually, but is inserted by xsubpp via a typemap which spares the programmer from the error-prone drudgery of repeating that code over and over.

      Rest assured that my production code always implements such checks. I agree that the example code is dangerous. I left out the safety code because the sample was already too long -- long enough to dissuade demerphq from looking at it, for example.

      --
      Marvin Humphrey
      Rectangular Research ― http://www.rectangular.com