PerlMonks  

Re^7: Inline::C and NULL pointers

by markong (Pilgrim)
on Dec 20, 2021 at 23:06 UTC (id://11139779)


in reply to Re^6: Inline::C and NULL pointers
in thread Inline::C and NULL pointers

Well ... Inline::C is, in the final analysis, nothing other than XS.

Of course it sits on top of XS, but so does any other tool that has to reach out to C/C++ (and, for that matter, the same holds in any other language, e.g. Java's JNI). SWIG is a case in point: it does that for a plethora of languages, but it's not just XS, or JNI, or Python's C extensions. It's all those things packaged in a coherent way with extensive documentation attached, which is really all I want when I am doing Perl, because if I am doing Perl I want to stay away from XS/C as much as possible.
The same goes for a compiler toolchain: it sits on top of many tools and low-level languages, but it's not just assembly or machine code.

So, we are talking about two different levels of abstraction: tools like Inline::C, which conveniently hide the low-level details and sit in the middle, should hopefully support the most common use cases and provide extension points for the uncommon ones.
If we agree that NULL pointers are a common feature of C libraries, then I would expect transparent support for them by default.
I'm pretty surprised that this never came up before, given the widespread use of NULLs in C code ... or maybe people just gave up (see next point).

... Inline::C already does that by providing us with the TYPEMAPS configuration option ... we can fix that using our own custom typemap.

Right, it could even be just a matter of a new module that installs an extra typemap definition borrowed from SWIG's ... if only that were doable in a reasonable manner: as your extensive analysis has shown in different parts of the thread, it seems the original author was drunk when he wrote the logic behind the whole "local typemap" option :P.
There are at least two bugs resulting in weird interactions when one tries to extend the very simple Perl default typemap to "override" something;
people probably used SWIG instead to handle NULL pointers.

Replies are listed 'Best First'.
Re^8: Inline::C and NULL pointers
by syphilis (Archbishop) on Dec 21, 2021 at 01:27 UTC
    If we agree that NULL pointers are a common feature of C libraries, then I would expect transparent support for them by default.

    That "transparent support" can only be enacted by altering ExtUtils/typemap. The way to address this is to file an Issue with perl5. (Thanks etj for supplying the appropriate link.)
    I certainly agree that NULL pointers are a common feature of C libraries ... I'm not convinced that the need to pass NULL pointers from perl to C functions is all that common.

    it seems the original author was drunk when he wrote the logic behind the whole "local typemap" option :P

    It seemed that way partly because of the way that things unfolded as I muddled my way through it. (You'd be excused for thinking that *I* was drunk. I wasn't, but I felt like I was by the end of it.)
    Based on what I know at the moment, I think the succinct way to describe the bug is:

    "User-supplied typemaps other than ./typemap and ../typemap cannot override the type settings specified in ExtUtils/typemap"

    I believe this is an ExtUtils::ParseXS bug that has been around for a long time - I see the same behaviour on perl-5.8.8.
    I would think it's very rare that a user wants to override an ExtUtils/typemap setting, and even rarer that a user would try to do that in a typemap other than ./typemap or ../typemap. So I'm not at all surprised that it has taken a long time to surface.

    I'll file a bug report, and update this thread with a link to it. Might take a day or two. I'll spend some time trying to work out how to fix it first.
    It might simply be that these problematic user typemaps are being prepended (instead of appended) to the list of typemaps. That would be consistent with what we're seeing, though the output I see with verbose (BUILD_NOISY) builds of the Inline::C scripts suggests that they are being appended.

    UPDATE: In ExtUtils::ParseXS::Utilities::process_typemaps() we find the following line of code:
    push @tm, standard_typemap_locations( \@INC );
    Prior to this push() any typemap specified by the user that is not in one of the standard typemap locations is already included in @tm.
    And this push() ensures that ExtUtils/typemap can be found *after* (and will override any conflicts with) that user-specified typemap.
    Changing the push() to an unshift() fixes that problem, enabling the user-specified typemap to take precedence - which is exactly what we want.
    However, I just need to determine what would be broken by that change. Another option is to stay with the push() and simply remove ExtUtils/typemap from the list returned by standard_typemap_locations(\@INC) - which I think might be a better approach if I can ascertain that ExtUtils/typemap is guaranteed to already be in @tm.
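
    As a rough illustration of the ordering issue described above, here is a simplified model in C. It is not the actual ExtUtils::ParseXS code: the struct, the function resolve(), the path my/typemap and the type name T_PV_OR_NULL are all made up for this sketch. It assumes each typemap file contributes one entry, and that a later file replaces an earlier file's entry for the same C type:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one typemap file reduced to a single entry that
 * says which XS type the file assigns to a given C type. */
struct typemap_file {
    const char *path;    /* where the file lives (for illustration only) */
    const char *ctype;   /* e.g. "char *" */
    const char *xstype;  /* e.g. "T_PV" */
};

/* Files are processed in list order, and a later entry for the same C type
 * replaces an earlier one, so the last matching file wins.
 * Returns NULL when no file has an entry for `ctype`. */
const char *resolve(const struct typemap_file *files, size_t n,
                    const char *ctype)
{
    const char *result = NULL;

    assert(files != NULL && ctype != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(files[i].ctype, ctype) == 0)
            result = files[i].xstype;
    }
    return result;
}
```

    With the list ordered { my/typemap, ExtUtils/typemap } the lookup for char * yields T_PV, which corresponds to the push() behaviour; with the order reversed it yields the user's type, which corresponds to the unshift().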

    Cheers,
    Rob
      Rob, I am glad that other Monks jumped in on this.

      I agree with this: I'm not convinced that the need to pass NULL pointers from perl to C functions is all that common.

      As I mentioned in previous posts, the OP is confronted with some poorly written C code that he can't change. This NULL pointer idea arises from an I/F that says: "if you give a pointer to memory, I will use that memory for output, assuming without question that you have given me enough memory for my yet to be generated output. If you don't give me such a pointer, I will give you a pointer to my non-thread-safe non-recursion-safe static memory." This is a bullshit I/F. But the OP can't change that.
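
      A minimal sketch of the kind of interface described above, for readers who have not met one. The function name format_name(), the buffer size and the output format are assumptions made up for illustration; the OP's actual library is not shown in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical function in the style described above.
 * If `out` is non-NULL the result is written there and the caller is
 * trusted to have supplied enough room; if `out` is NULL the result goes
 * into a static buffer owned by the function. */
char *format_name(const char *name, char *out)
{
    static char fallback[64];   /* shared by every caller: not thread-safe */

    assert(name != NULL);

    if (out == NULL) {
        /* Bounded write, because the function knows its own buffer size. */
        snprintf(fallback, sizeof fallback, "name: %s", name);
        return fallback;
    }

    /* No size is passed in, so this write cannot be bounded: the caller
     * must guarantee that `out` holds at least strlen(name) + 7 bytes. */
    sprintf(out, "name: %s", name);
    return out;
}
```

      Two successive calls with a NULL second argument return the same pointer, so the second call overwrites the first result. That is the non-thread-safe, non-recursion-safe behaviour complained about above.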

      NULL as the 2nd param is a weird situation. If there could be one arg and an optional second arg, then normally this would be implemented with a variable number of args: you don't put NULL for that second arg; it is simply not there at all for the caller. This of course requires different C code than what the OP is dealing with. printf(), for example, uses a variable number of arguments.
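
      A sketch of what such a variadic variant might look like, under stated assumptions: the name greet(), the have_buf flag and the (buffer, size) pair are hypothetical. Note that C cannot detect a missing argument by itself, so the caller still has to say whether the optional arguments are present, much as printf() relies on its format string.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical variadic variant of an "optional output buffer" interface.
 * When `have_buf` is non-zero, two more arguments must follow:
 * a char * buffer and its size as a size_t. Otherwise a static buffer
 * is used, with the usual thread-safety caveat. */
const char *greet(const char *name, int have_buf, ...)
{
    static char fallback[64];
    char  *out = fallback;
    size_t cap = sizeof fallback;

    assert(name != NULL);

    if (have_buf) {
        va_list ap;
        va_start(ap, have_buf);
        out = va_arg(ap, char *);   /* caller-supplied buffer */
        cap = va_arg(ap, size_t);   /* and its capacity */
        va_end(ap);
    }

    /* Bounded in both cases, because the capacity is always known. */
    snprintf(out, cap, "hello, %s", name);
    return out;
}
```

      A call without a buffer is greet("xyz", 0); a call with one is greet("abc", 1, buf, sizeof buf).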

      I did have fun with Inline::C and found it to be "easy to use" for all the "heavy lifting" that it does.

        In general C programming, one doesn't have optional arguments. And generally, being able to give a pointer value that is clearly not intended as valid (a NULL) is a valuable thing. That maps very nicely for a Perl interface, to turning an undef input (or, conceivably, no input at all) into a NULL in C terms.

        tl;dr: C and Perl have different idioms, and while XS should probably take an SV* and treat it idiomatically, actual C has different needs.

      I certainly agree that NULL pointers are a common feature of C libraries ... I'm not convinced that the need to pass NULL pointers from perl to C functions is all that common.

      There are many discussions about the right approach and, from what I have read, there seems to be no established consensus. But!
      Considering that the nature of the tool is to glue onto a different language's interface (i.e. a collection of functions), any source type you map into C/C++ will, in the majority of cases, be used as a function argument.

      Consider also that passing NULL as an argument is not just poor C programming practice: you often need a "special" value to signal a condition, and sometimes NULL is even required(!), e.g. strtok(3)/snprintf(3)/gettimeofday(2).
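
      Two of those standard functions can be shown in a short sketch. The helper names count_tokens() and formatted_length() are made up for illustration; strtok() and snprintf() are used as the C standard defines them. gettimeofday() is left out only to keep the example portable beyond POSIX.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count the tokens in `text` separated by any character in `delims`.
 * strtok() modifies its input, so `text` must be writable. After the
 * first call, strtok() must be passed NULL to continue with the same
 * string: here NULL is part of the API contract, not an error value. */
int count_tokens(char *text, const char *delims)
{
    int n = 0;

    assert(text != NULL && delims != NULL);
    for (char *tok = strtok(text, delims); tok != NULL;
         tok = strtok(NULL, delims))
        n++;
    return n;
}

/* Return the number of bytes, excluding the terminating NUL, needed to
 * format "<key>=<value>". C99 permits a NULL buffer when the size is 0,
 * which is the usual way to measure before allocating. */
int formatted_length(const char *key, int value)
{
    assert(key != NULL);
    return snprintf(NULL, 0, "%s=%d", key, value);
}
```

      A binding layer that cannot turn undef into NULL has no way to express either call from the Perl side.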

      As SWIG already does, undef ==> NULL feels completely natural, and I personally consider its absence in Inline a bug. I've already solved my problem by using SWIG, as usual; maybe some Inline::C users could take it from here and consider opening a related enhancement request.

        maybe some Inline::C users could take it from here and consider opening a related enhancement request

        This is the bit I don't quite get.
        Precisely what, IYO, should that "enhancement request" be seeking ?
        At the moment, all I've got is that "undef, when passed from perl to unsigned char * C argument, should be NULL". (That problem, at least, is solved.)
        What if we're passing something other than "undef" ? Should that be T_PV or T_PTR ? ... or something else ?

        What happens when the C function unsigned char * foo() returns a NULL to perl ? Should that come back as undef ?
        With T_PV it returns undef; with T_PTR it returns the IV zero. But this behaviour can be manipulated to whatever we want in ExtUtils/typemap (or user-provided typemap).

        Just give me a clear spec, I'll write a patch to ExtUtils/typemap that enacts that spec, and, if it passes review here, I'll see if I can get perl porters to accept it.
        That's where the change would best be made.

        If the proposed change is unacceptable to them, then we can look at making the change in Inline::C by use of a customized typemap. (No guarantees that it will be accepted there, either ... we'll just have to wait and see.)
        But I first need to see a clear spec of the requirement, telling me exactly what needs to be changed.

        Cheers,
        Rob

        UPDATE: If the only thing we want to do is to ensure that "undef" is passed as NULL to a char * (either signed or unsigned) then I think we need to change the "INPUT" setting in ExtUtils/typemap for T_PV from:
        $var = ($type)SvPV_nolen($arg)
        to
        if (SvOK($arg)) $var = ($type)SvPV_nolen($arg); else $var = INT2PTR($type,SvIV($arg))
        We can also achieve the same effect by creating a file named "typemap" that contains:
        INPUT
        T_PV
            if (SvOK($arg)) $var = ($type)SvPV_nolen($arg); else $var = INT2PTR($type,SvIV($arg))
        That file needs to be placed in a location where it will automatically be recognized as a typemap.

        For example, place that typemap file in the same directory as this little Inline::C script:
        use strict;
        use warnings;
        use Devel::Peek;

        use Inline C => Config =>
            BUILD_NOISY => 1,        # verbose build
            FORCE_BUILD => 1,        # re-build whenever the script is run
            CLEAN_AFTER_BUILD => 0,  # don't clean up the build directory
        ;

        use Inline C =><<'EOC';

        unsigned char * foo(unsigned char *name) {
            if(name) printf("name is: %s\n", name);
            else printf("NULL input (undef) was detected\n");
            return(name);
        }

        EOC

        my $x = foo(undef);
        Dump($x);
        my $y = foo("hello world");
        Dump($y);
        my $z = foo('');
        Dump $z;
        Then cd to that directory, run the script and tell me if it's doing something undesirable.

        Of course unsigned char* is not the only thing that maps to T_PV - char*, const char*, caddr_t, wchar_t* and Time_t* all map to T_PV, and will therefore all be affected by that typemap.
        But, if need be, we could always create a new and distinct type for those that need to use this revised setting.

        To run that script with the settings specified by ExtUtils/typemap, just rename "typemap" to something else, and it will be ignored.
