PerlMonks  

Inline::C and NULL pointers

by markong (Pilgrim)
on Dec 19, 2021 at 00:09 UTC

markong has asked for the wisdom of the Perl Monks concerning the following question:

Hello!

I have to investigate the functionality of a C library. In the past I always used SWIG to interface with C/C++ code, but I am now trying Inline::C instead to save time on the "interface compilation" phase, since I'm only interested in experimenting with the C functions and copy/paste is not a problem.

All is good except that I don't see how to "pass a NULL pointer from Perl". SWIG handles this when you pass undef, but Inline::C explodes on that, e.g.:

    use v5.10;
    use Inline 'C';

    say "At least we have compiled a bit of C code!";
    my $ret= "HD3";
    my $output = enctypex_msname("HD2", undef);

    say "Output from enctypex_msname() call: '$output', \$ret is '$ret'";

    __END__
    __C__
    unsigned char *enctypex_msname(unsigned char *name, unsigned char *retname) {
        static unsigned char    msname[256];
        unsigned    i,
                    c,
                    server_num;

        if(!name) return(NULL);

        server_num = 0;
        for(i = 0; name[i]; i++) {
            c = tolower(name[i]);
            server_num = c - (server_num * 0x63306ce7);
        }
        server_num %= 20;

        if(retname) {
            snprintf(retname, 256, "%s.ms%d.host.com", name, server_num);
            return(retname);
        }
        snprintf(msname, sizeof(msname), "%s.ms%d.host.com", name, server_num);
        return(msname);
    }

This results in:

At least we have compiled a bit of C code!
Use of uninitialized value in subroutine entry at script/decode_test.pl line 12.
Segmentation fault (core dumped)

I see that SWIG provides some custom type mappings for all the bindings, so unless I reuse those I am probably out of luck and have to write one for my use case? I wonder how it is that nobody has ever had to pass a NULL pointer, or have I overlooked something in the Inline::* docs?
Any hints?

Replies are listed 'Best First'.
Re: Inline::C and NULL pointers
by Marshall (Canon) on Dec 19, 2021 at 06:44 UTC
    I really don't have much experience with Inline::C, but I took a look at your problem. It appears to me that you really don't want an "always" mapping of undef to a NULL pointer, because undef in another context might mean a missing int in an array of ints or something else. Here, in this particular case, undef means a NULL C pointer to char, and I think you have to check for that explicitly.

    I wrote some simple test code below. In this case, I used SV* instead of char*. The C code figures out whether the SV contains a valid pointer to a string. I tested with both undef and "asdf" as test inputs - see below. I suspect that something is missing in the case of UTF-8 strings. Also, my simple test probably should be more complex, because passing, say, an int would be treated the same as sending undef (the test is just "valid string or not?").

    Hope that this is on the right track...
    Update: I guess the obvious was unsaid: why not use "" instead of undef? A null string is not undef.

        use v5.10;
        use Inline 'C';

        say "At least we have compiled a bit of C code!";
        my $ret= "HD3";
        my $output = enctypex_msname("HD2", undef);

        say "Output from enctypex_msname() call: '$output', \$ret is '$ret'";

        =OUTPUT
        CASE 1: with: my $output = enctypex_msname("HD2", "asdf");
        At least we have compiled a bit of C code!
        Output from enctypex_msname() call: 'HD2', $ret is 'HD3'
        HD2 asdf
        name 0000000002D2BC58 retname 0000000002D2BB68

        CASE 2: my $output = enctypex_msname("HD2", undef);
        At least we have compiled a bit of C code!
        Output from enctypex_msname() call: 'HD2', $ret is 'HD3'
        HD2 (null)
        name 0000000002C93298 retname 0000000000000000
        =cut

        __END__
        __C__
        unsigned char *enctypex_msname(char *name, SV* retname)
        {
            char * retname_string = SvPOK(retname) ? SvPV_nolen(retname) : NULL;
            printf ("%s %s\n",name,retname_string);
            printf ("%s %p %s %p\n", "name ",name, "retname ",retname_string);
            return name;
        }

    PS again on the C code itself: "static unsigned char msname[256];" could be trouble. There is only one msname array. If you call this function again, your code will return the same address for msname as it did the last time. Also, perhaps if(!name) return(NULL); should be if(!*name) return(NULL); because "" does not mean NULL. It is a pointer to an array of char whose first char is \0, so the string is "empty", which is not the same as a NULL pointer.
Re: Inline::C and NULL pointers
by Marshall (Canon) on Dec 19, 2021 at 09:32 UTC
    My previous post covered one way of "how to do it". This is a separate post because the focus is on "why are you doing that?" and a re-coding of your original code.

    I did a bit of "mind reading" and re-coded your C function. It looks like you wanted 2 different ways for your function to return the result? If you gave it a "ret string", you wanted that string modified "in place"? Otherwise you wanted a new result to be passed to $output? Very confusing to me. Please un-confuse me.

    Anyway, I don't see the need for this "ret" input parameter at all. Take in a string, do something that generates another string and return that string. What is the need for this second input string? It is possible to modify a string "in place", but that would require the caller to provide enough memory so that when the string becomes longer, it does not overflow allocated memory. I have no idea about the details of your string transformation equation - I leave that detail to you. Here is my version below. There is one way to get the input and one way to get the output.

        use v5.10;
        use Inline 'C';

        say "At least we have compiled a bit of C code!";
        my $output = enctypex_msname("HD2");

        say "Output from enctypex_msname() call: $output\n";

        =OUTPUT:
        At least we have compiled a bit of C code!
        Output from enctypex_msname() call: HD2.ms14.host.com
        =cut

        __END__
        __C__
        unsigned char *enctypex_msname (unsigned char *name)
        {
            unsigned char temp[256];
            unsigned int c;
            unsigned char* mover;

            if(!*name) return(NULL);  /* empty string "" */
                                      /* not sure if this will result in undef?? */
            unsigned int server_num = 0;
            for( mover = name; *mover; mover++)
            {
                c = tolower(*mover);
                server_num = c - (server_num * 0x63306ce7); /*some comment appropriate here!!*/
            }
            server_num %= 20;

            snprintf(temp, sizeof(temp), "%s.ms%d.host.com", name, server_num);
            int length_of_temp_string = strlen(temp)+1;
            char* ret_buf = (char*)malloc(length_of_temp_string); //could just allocate 256, so what?
            if (!ret_buf) { perror("malloc"); exit(1); }
            memcpy(ret_buf, temp, length_of_temp_string);
            return(ret_buf);
        }

    It is apparent that the malloc'ed memory gets passed as an SV to $output. At that point, I don't know for sure how this works, but Perl would have to be responsible for calling "free()" on that pointer when the memory for $output is not needed anymore.

    I didn't allocate more memory than needed, but that detail doesn't matter much. What does matter is not "re-using memory in a way that is opaque to the user". That is what your version with "static" would do. I recommend against that.

    Update: Geez, I am updating a lot tonight. My brain is working on a different project at the moment... But the thought does occur that I don't see anything in this code that would warrant a C subroutine (there is no performance issue). If you explain more about the string transformation algorithm, I am confident that this could be written in native Perl code. In my testing, it does take some extra time to invoke the C compiler, and any generated messages are a bit harder to understand than they would be with a native C program. A "pure Perl" implementation would start faster, and the run time would be no different. But I get it that you are playing with various options of passing parameters and getting results.

    Well as I continue to learn...Update...it is not necessary to do this malloc() stuff...it is possible to create/allocate memory for a new SV string object in one C statement as shown below. This stuff is complex. To say the least!

        use v5.10;
        use Inline 'C';

        say "At least we have compiled a bit of C code!";
        my $output = enctypex_msname("HD2");

        say "Output from enctypex_msname() call: $output\n";

        =OUTPUT:
        At least we have compiled a bit of C code!
        Output from enctypex_msname() call: HD2.ms14.host.com
        =cut

        __END__
        __C__
        SV *enctypex_msname (unsigned char *name)
        {
            unsigned int c;
            unsigned char* mover;

            if(!*name) return(NULL);  /* empty string "" */
                                      /* not sure if this will result in undef?? */
            unsigned int server_num = 0;
            for( mover = name; *mover; mover++)
            {
                c = tolower(*mover);
                server_num = c - (server_num * 0x63306ce7); /*some comment appropriate here!!*/
            }
            server_num %= 20;

            return (newSVpvf("%s.ms%d.host.com", name, server_num)); // malloc is here...
                                                                     // temp allocation here too
        }

      Hi and thanks for both the replies!

      I should probably have been more explicit about this: the function you're looking at (enctypex_msname) is just a "utility" extracted from a big file that encodes/decodes data and was written by a third party.
      The C is quite a mess (both in its structure and in its often contrived logic), and the author uses those if (name_of_pointer_char) {} tests to see whether something was passed into the function. I have no control over the code, or rather I'd like to stay away from it as much as possible :).
      But I need to decode some data and run some tests, and doing it from Perl will be a lot faster. I won't rewrite the code to be "XS compatible" because that doesn't make sense (I would rather rewrite the thing directly in Perl), but I need to pass "NULL" pointers to the functions, and unless there's some quick recipe for doing that with Inline::C, I'd better write a SWIG interface for the library and pass undef as needed.

        but I need to pass "NULL" pointers to the functions

        How to do that is an interesting puzzle. I've been playing with this little Inline::C demo:
            use strict;
            use warnings;

            use Inline C => Config =>
              BUILD_NOISY => 1,
            ;

            use Inline C =><<'EOC';

            unsigned char * foo(unsigned char * name) {
              if(name) printf("True\n");
              else printf("False\n");
              return(NULL);
            }

            unsigned char * bar() {
              return (foo(NULL));
            }

            EOC

            foo(undef);
            bar();

            __END__
            Outputs:
            True
            False

        The question is: how can I call foo() directly from Perl such that it will output "False"? One way is to call bar(), which then calls foo() - and I think that the OP's existing enctypex_msname function could be accessed (as is) via this technique of calling a wrapper function. But that's still not a direct call to foo().

        AFAICT, the issue is the typemapping of "unsigned char *" (which can be found in perl's lib/ExtUtils/typemap file).
        This default typemapping maps "unsigned char *" to T_PV, but if we change that to T_PTR, then calling foo(undef) in my little demo will have it output "False" as desired. (I expect this is what SWIG, in effect, does.)

        Inline::C allows us to use our own typemaps, so I wrote an alternative typemap for unsigned char * and I placed that file (named "nullmap.txt") in the same directory as the script, and pointed the script to it (in the inline configuration section). Unfortunately, the default typemapping was still used.
        So ... as proof of concept, I replaced "unsigned char *" in nullmap.txt with "unmapped". It looks like this:
            # plagiarised from default perl typemap
            unmapped    T_PTR

            INPUT
            T_PTR
                $var = INT2PTR($type,SvIV($arg))

            OUTPUT
            T_PTR
                sv_setiv($arg, PTR2IV($var));

        Then, in the script, I typedeffed the "unsigned char *" to "unmapped", and replaced "unsigned char *" with "unmapped" in the function declaration:
            use strict;
            use warnings;

            use Inline C => Config =>
              TYPEMAPS => "nullmap.txt",
              BUILD_NOISY => 1,
            ;

            use Inline C =><<'EOC';

            typedef unsigned char * unmapped;

            unmapped foo(unmapped name) {
              if(name) printf("True\n");
              else printf("False\n");
              return(NULL);
            }

            unsigned char * bar() {
              return (foo(NULL));
            }

            EOC

            foo(undef);
            bar();

            __END__
            Outputs:
            False
            False

        So we have 2 essentially identical scripts that output different results. One of them specifies an "unmapped" type where the other specifies "unsigned char *" - yet the two types are the same thing. The difference occurs because the two types employ different typemapping.

        Note that it should not be necessary to replace *all* occurrences of "unsigned char *" with "unmapped" (or whatever replacement name is chosen) - just renaming those occurrences found in the declaration would be all that's needed.
        And then there's the question of whether both the return type and the argument type should be rewritten to "unmapped". (Perhaps it's only the argument type that needs to be renamed.)
        Also, when fiddling around with Inline scripts, we need to remember that a change to the Inline Config section will not trigger a recompilation of the C code unless that Config section specifies FORCE_BUILD => 1. Otherwise, it's necessary to alter the actual code section.

        I'm surprised that specifying an alternative Inline::C typemap doesn't override the default typemap for any types that are specified in both.
        Is that an Inline::C bug ? Or a design flaw ?
        Maybe it's the way it has to be.

        Cheers,
        Rob
        Ok, I have been learning myself! Geez this question got me going this late evening... I have been having fun playing with this although verbose in my incremental discoveries. Nobody else was answering so I thought I'd have a go at it! The basic issue appears to be:
            FuncA("abc",undef,"xyz");
            FuncA("abc",3,"xyz");
            FuncA("abc","nmo","xyz");

            FuncB(1,undef,3);
            FuncB(1,"some string",3);

        C doesn't have the concept of "undef". It appears to me that you will have to tell C how undef should be interpreted - whether in this particular context it means a string, int or whatever. In the preceding, the C code would have to know whether undef as the second param means a "missing" string or an int. Maybe for an expected int, undef defaults to a zero? or maybe -1 or maybe 99999? But it appears to me that would have to be within the C code, not in any general mapping of Perl to C? Because undef could mean many different things to C depending upon the context. I don't know how SWIG would help?

        So as my first example post code shows, instead of char* in the C function definition, you need SV*. undef can be passed as an SV value. There is a C function to determine the exact Perl "type" of that SV value (I didn't show that, but there is such a function). My example just showed a simple test for "does this SV structure contain a Perl string or not?". I have seen more complex tests often with complicated || logic. But my first test code shows a simple test for whether the SV contains a valid string within its structure or not and how to deal with the result of that test.

        You will have to modify the C function definition to have a pointer to SV (SV*) instead of pointer to character (char*). As my code shows, if that SV doesn't have a valid string, set pointer equal to NULL. If that SV does contain a string, then extract the pointer to that string. When you have a simple char*, this translation is done automatically for you. But undef throws a wrinkle into the works.

        I am sure there are some really wild border cases, like Perl passing "123": that is a string, but Perl would treat it as an int if used in a math operation.

        Yet another thought:
        As I think about this, you are copying and pasting this guy's C code into an Inline::C Perl section mainly to reduce "start time". You want to call this guy's C code from Perl. But you don't want to modify this guy's code. So instead of calling this guy's code directly, you could write a "wrapper" C function, which you call from Perl. That C wrapper resolves these undef issues and then calls that guy's code. His code is never called with an undef value.

        You write:

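            unsigned char *my_enctypex_msname(unsigned char *name, SV *ret)
            {
                /* This C code handles undef appropriately and then calls the
                   "copy and paste" enctypex_msname(unsigned char *name, unsigned char *ret).
                   If ret holds a string, its buffer must be big enough for the
                   result (the original writes up to 256 bytes into it). */
                unsigned char *retname = SvPOK(ret) ? (unsigned char *)SvPV_nolen(ret) : NULL;
                return enctypex_msname(name, retname);
            }
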
        Your Perl code calls the "my function".
        It might be worth submitting a bug report to Inline::C about this. As an occasional user of Inline::C, I'm surprised that undefs wouldn't become NULL; that's the behavior I'd want too. I'm surprised I never ran into that before.
Re: Inline::C and NULL pointers
by Chrysotoxum (Novice) on Dec 19, 2021 at 11:38 UTC

    Here's a little program illustrating one way of using undef as a 'NULL pointer' parameter in Inline C.

        use strict;
        use warnings;

        my $point = pack 'f2', 1.2, 2.3;
        func(\$point);
        func(undef); # pass in 'NULL pointer'

        use Inline C => <<'C_END';
        typedef struct { float x, y; } Point;

        void func(SV *buf) {
          SV *point_ref = SvRV(buf);
          if (!point_ref) {
            printf("got a null pointer\n");
          }
          else {
            /* dereference */
            Point *point = (Point *)SvPV_nolen(point_ref);
            printf("x=%.2f, y=%.2f\n", point->x, point->y);
          }
        }
        C_END

    Hope this helps.

      Rather than requiring passing a reference, it would be better to just pass a normal scalar and use SvOK to see if it is defined rather than what looks a bit contrived with the SvRV. As it is, your code does not handle the case of a non-ref value.

        Just updated the program to take on board your helpful critique.

            use strict;
            use warnings;

            my $point = pack 'f2', 1.2, 2.3;
            func(undef); # pass in 'NULL pointer'
            func($point);

            use Inline C => <<'C_END';
            typedef struct { float x, y; } Point;

            void func(SV *buf) {
              if (!SvOK(buf)) {
                /* undef stands in for the NULL pointer */
                printf("got a null pointer\n");
              }
              else if (SvPOK(buf) && SvCUR(buf) == sizeof(Point)) {
                /* a string of exactly the right length: treat it as a Point */
                Point *point = (Point *)SvPV_nolen(buf);
                printf("x=%.2f, y=%.2f\n", point->x, point->y);
              }
              else {
                croak("invalid scalar");
              }
            }
            C_END
Re: Inline::C and NULL pointers
by swl (Parson) on Dec 21, 2021 at 00:32 UTC

    Posts in the rest of this thread suggest you are not wedded to using Inline::C. If this is the case then it is worth having a look at FFI::Platypus.

    A search of the FFI::Platypus::Type docs for "NULL pointer" gives a few examples that suggest undefs can be automatically mapped to NULLs.

    There are also quite a few modules on CPAN that use FFI::Platypus for these sorts of bindings, so plenty of prior art that can be adapted.

Node Type: perlquestion [id://11139712]
Approved by Marshall
Front-paged by Corion