maasha has asked for the wisdom of the Perl Monks concerning the following question:

In order to create a compact table for recording scores fast, I want to create a bytestring with pack (so that the memory is under Perl's memory management) and send it to an Inline::C function, where this bytestring is cast to a 2d array and filled with scores. How can this be done? I have been trying stuff like this:

sub perl_routine
{
        $byte_string = pack "I*", (0) x ( $x_dimension x $y_dimension );

        c_function( $byte_string );
}

void c_function( char *byte_string )
{
        int **table = ( int ** ) byte_string;
        int i, j;

        for ( i = 0; i < foo, i++ )
        {
                for ( j = 0; j < bar, j++ )
                {
                        tableij++;
                }
        }
}

But I can't get it to work -> segfault. Assistance is most welcome! Thanks, Martin

Aaargh, the pre formatting does not work and the brackets are eaten. The increment line is supposed to be table[ i ][ j ]++

Re: Passing a bytestring from Perl to Inline::C as a 2d array
by BrowserUk (Patriarch) on Nov 11, 2009 at 23:49 UTC

    You cannot cast a 1D array of integers (as produced by pack) to a 2D array of pointers to arrays of integers. You need to do the index calculations yourself (untested):

    void c_function( char *bytes, int rows, int cols ) {
        int *table = (int *)bytes;
        int i, j;
        for ( j = 0; j < cols; ++j ) {
            for ( i = 0; i < rows; ++i ) {
                ++table[ j * rows + i ];
            }
        }
    }

    Also, please use <code> ... </code> tags not <pre> ... </pre> tags


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Passing a bytestring from Perl to Inline::C as a 2d array
by happy.barney (Friar) on Nov 11, 2009 at 21:06 UTC
    An int ** in C is an array of pointers to 1d arrays, not a flat block like the one pack gives you. So declare table as a plain int * and try rewriting your loop to something like this:

    for (i=0; i<foo; i++) {
        for (j=0; j<bar; j++) {
            table[i*bar + j]++;
        }
    }
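The flat indexing suggested above can be sketched as a small helper. This is only an illustration: the names `cell_index` and `bump_all` are made up here, and the layout is assumed to be row-major (all cells of row 0 first, then row 1, and so on), which is the `i*bar + j` form.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major offset of cell (row, col) in a flat table with `cols` columns. */
static size_t cell_index(size_t row, size_t col, size_t cols)
{
    assert(col < cols);
    return row * cols + col;
}

/* Increment every cell of a rows x cols table stored as one flat int block. */
void bump_all(int *table, size_t rows, size_t cols)
{
    size_t i, j;

    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {
            table[cell_index(i, j, cols)]++;
        }
    }
}
```

The column-major form `j * rows + i` works just as well, as long as every reader and writer of the table uses the same formula.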
Re: Passing a bytestring from Perl to Inline::C as a 2d array
by ikegami (Patriarch) on Nov 12, 2009 at 00:51 UTC
    Here's what it would look like if you actually created a 2d array:
    sub perl_routine {
        my @array = map { pack "I*", (0) x $x_sz } 1..$y_sz;
        my $array = pack "P"x@array, @array;
        c_function( $array, $y_sz, $x_sz );
    }

    void c_function( char *arg, int y_sz, int x_sz ) {
        int ** const table = ( int ** )arg;
        int y, x;
        for ( y = 0; y < y_sz; y++ ) {
            for ( x = 0; x < x_sz; x++ ) {
                ++table[y][x];
            }
        }
    }

    It's odd that P* doesn't work

Re: Passing a bytestring from Perl to Inline::C as a 2d array
by Marshall (Canon) on Nov 11, 2009 at 23:03 UTC
    Update: I looked at this requirement post again: "In order to create a compact table for recording scores fast,....". What I wrote below about C isn't wrong. But I am now of the opinion that you don't even need C until you show otherwise... Please explain more about your app and why you think that you even need to interface with 'C'. What kind of benchmarks have you done? Do you need help on that part? Often the right algorithm and data structure is more important than the raw speed of the language. So, I think now that you should explain why you even need 'C'.
    end of update

    I don't know the answer to the Perl<->C interface question. But your C code looks strange to me.

    I have no idea what this "foo" and "bar" stuff is about. In C, practical 2-D arrays are built as array of pointer to "type" and that pointer array is NULL terminated. In this case, pointer to pointer to char. The "built-in" C 2D array is pretty much worthless in practical applications.

    Anyway, what you are describing from the C point of view is exactly like "argv". I don't see how your C code even compiles! This is a Perl forum, but I would think a first step to interface Perl and C together is to be able to write working C code! Below, dump_strings() will dump "argv" or the "data" structure.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv) {
        void dump_strings( char **byte_strings );
        char *data[] = {"some stuff", "more stuff", "and third stuff", NULL };
        dump_strings(data);
        dump_strings(argv);
        exit(0);
    }

    void dump_strings ( char **byte_strings ) {
        for ( ; *byte_strings; byte_strings++) {
            printf ("%s\n",*byte_strings);
            /* a per char loop would go here */
        }
    }
    /* prints:
    some stuff
    more stuff
    and third stuff
    */
      In C, practical 2-D arrays are built as array of pointer to "type" and that pointer array is NULL terminated. In this case, pointer to pointer to char. The "built-in" C 2D array is pretty much worthless in practical applications.

      Actually, NULL termination is almost invariably a terrible idea. It wastes space, because you should know how long the array is. If you don't know how long the array is, then you can only safely access it sequentially; if you need random access, this means you have to calculate the length anyway, by scanning the whole thing looking for the NULL, before you can access it at all. This is an unnecessary complication when you could just use that extra word of memory to store the length of the array in the first place.

      To be honest, storing as pointer-to-pointer is also often an unnecessary complication. It's very easy to mess up memory management when nesting pointers.
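The trade-off between a NULL terminator and a stored length can be sketched as follows. The struct `str_list` and both function names are hypothetical, invented for this illustration; nothing like them appears in the thread's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container: the pointer array travels with its own length. */
struct str_list {
    size_t len;
    const char **items;
};

/* NULL-terminated style: the length is only known after a full scan. */
size_t count_null_terminated(const char *const *items)
{
    size_t n = 0;

    while (items[n] != NULL) {
        n++;
    }
    return n;
}

/* Counted style: bounds-checked random access without scanning.
   Returns NULL when i is out of range. */
const char *str_list_get(const struct str_list *list, size_t i)
{
    assert(list != NULL);
    return i < list->len ? list->items[i] : NULL;
}
```

With the counted form, a lookup such as `str_list_get(&list, 2)` costs one comparison, while the terminated form has to walk the array once before any index can be validated.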

        Well, actually NULL termination is almost always the right idea with an array of pointers and is THE WAY to pass matrices around in 'C'. I will grant you some exceptions, but they are few, very few. I find your claim rather odd.

        Update: First claim: wastes space: Nonsense!

        Memory is usually allocated in increments of (8 or 16 pointers) x X, where X >= 1. On my machine (Win XP), a pointer takes the same size as an int. Memory is allocated in hunks of 128 bytes or 32 "words" of 4 bytes each. So the storage of the NULL pointer will require one more allocation unit 1 of 32 times with a random number of things in a list. Storage of a "count" will require at least that much and probably more, because now we have an "extra thing" in addition to the pointers to data. Now we get into an "object" in OO-ese, and depending upon how this is implemented, this could wind up taking a lot more storage to represent a 2D array!

        Second Claim: if you need random access, this means you have to calculate the length anyway, by scanning the whole thing looking for the NULL, before you can access it at all.


        Well of course not! If "you" made this structure, with the intention of using/modifying it, you will know how "big it is". I will show some code below that accesses a char** array. 'C' doesn't have any limits on how array indices are calculated. C assumes that you "know what you are doing", and you can screw-up massively!

        Update: this code below does screw up. It works because the data array was declared and assigned all at once. Like I said above: "you can screw-up massively!" I did it also! In general each "row" will not have a correlation with another "row". See how this can fool me (and you)? And this is exactly my point about using pointers to traverse a char** structure instead of indices. This doesn't mean that "random access" isn't possible... it is! Just like below. BUT you have to know how many things are on THAT ROW, i.e. the number of COLUMNS for that row.

        Many matrices have the same number of columns for each row, a char ** usually does not. The determination of number of rows is a trivial thing...just like in Perl! If the rows are "ragged", non-equal number of columns, then things get more complex...just like in Perl!

        #include <stdio.h>
        int main() {
            char *data[] = {"some stuff", "ABCEDFGHIJ", "and third stuff", NULL };
            /* The out-of-range indices below are undefined behavior; the output
               depends on where the compiler happens to place the string literals. */
            printf ("%c\n",data[0][28]); /*prints "i", in third*/
            printf ("%c\n",data[2][7] ); /*prints "r", in third*/
            printf ("%c\n",data[2][6] ); /*prints "i", in third*/
            printf ("%c\n",data[1][23]); /*prints "u", in stuff*/
            printf ("%c\n",data[2][-9]); /*prints "C", in ABC...*/
            printf ("%c\n",data[1][-5]); /*prints "t", in stuff*/
            return(0);
        }
Re: Passing a bytestring from Perl to Inline::C as a 2d array
by jethro (Monsignor) on Nov 11, 2009 at 21:58 UTC
    You can find information on which HTML tags to use all around the input box of the perlmonks website, just open your eyes.

    In short, don't use <pre>, use <code> tags for code.

Re: Passing a bytestring from Perl to Inline::C as a 2d array
by Anonymous Monk on Nov 13, 2009 at 10:48 UTC

    Thank you all for your input.

    I know that this problem may be solved in Perl or PDL, but now I am interested in exploring how it can be solved with Inline::C. I need speed, and I need compact.

    Next, I see no problem in casting a bytestring using pack from Perl as a way of allocating a block of memory to be used in C as long as the block size is sufficient. That way the block of memory will be under Perl's memory management and garbage collection.
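The "block size is sufficient" condition can be checked on the C side before the cast. The sketch below is only an illustration: `table_fits` and `as_int_table` are made-up names, and it assumes the Perl caller passes `length($byte_string)` as `byte_len`. Since pack's `I` template writes native unsigned ints, `sizeof(int)` is the matching element size. Note also that the element count on the Perl side has to be computed with `*`; `x` is Perl's repetition operator, so `$x_dimension x $y_dimension` does not multiply the two dimensions.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a buffer of byte_len bytes can hold rows x cols ints, else 0. */
int table_fits(size_t byte_len, size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0) {
        return 1;                                   /* empty table always fits */
    }
    if (cols > (size_t)-1 / rows) {
        return 0;                                   /* rows * cols would overflow */
    }
    if (rows * cols > (size_t)-1 / sizeof(int)) {
        return 0;                                   /* byte count would overflow */
    }
    return byte_len >= rows * cols * sizeof(int);
}

/* View the byte buffer as a flat int table, aborting if it is too small.
   Assumes the buffer is suitably aligned for int. */
int *as_int_table(char *bytes, size_t byte_len, size_t rows, size_t cols)
{
    assert(bytes != NULL);
    assert(table_fits(byte_len, rows, cols));
    return (int *)bytes;
}
```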

    As I understand, you can overrule the Perl type from C by type casting, so the pack bytestring does not have to be an array of pointers.

    Now, what is the problem with The "built-in" C 2D array? I should say that is exactly what I want - a 2D table of numbers. However, I will try my own index calculations as suggested.
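For what it is worth, a true built-in C 2D array is one contiguous block, so a flat buffer can be viewed as one by casting to a pointer-to-array rather than to `int **`. The sketch below assumes a compiler that supports C99 variably modified types (gcc and clang do; this is not guaranteed everywhere) and a buffer that is suitably aligned for int. The function name `bump_all_2d` is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Increment every cell, addressing the flat block with table[i][j] syntax. */
void bump_all_2d(char *bytes, size_t rows, size_t cols)
{
    assert(bytes != NULL && cols > 0);

    /* Pointer to rows of `cols` ints each: one contiguous block, no pointer table. */
    int (*table)[cols] = (int (*)[cols]) bytes;
    size_t i, j;

    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {
            table[i][j]++;
        }
    }
}
```

Here `table[i][j]` compiles down to the same `i * cols + j` arithmetic as the hand-written index calculation; no array of row pointers is involved.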

    Here is the code mock-up again - this time in code tags :o).

    sub perl_routine {
        $byte_string = pack "I*", (0) x ( $x_dimension x $y_dimension );
        c_function( $byte_string, $x_dimension, $y_dimension );
    }

    void c_function( char *byte_string, int x_dimension, int y_dimension ) {
        int **table = ( int ** ) byte_string;
        int i, j;
        for ( i = 0; i < x_dimension; i++ ) {
            for ( j = 0; j < y_dimension; j++ ) {
                table[ i ][ j ]++;
            }
        }
    }

    Martin

      pack "I*", (0) x ( $x_dimension x $y_dimension );

      int **table = ( int ** ) byte_string;

        table[ i ][ j ]++;

      That still won't work and cannot be made to work! You will still get segfaults.

      You cannot cast a pointer to a 1D array to a pointer to a 2D array of arrays. Full stop.

      The memory for a 1D array (as produced by pack) is laid out so:

         ----------------------------....---------------------
      ...|int1|int2|int3|int4|int5|....|N-3 |N-2 |N-1 |N   |
         ----------------------------....---------------------

      For a 2D array like so:

      ------
      |ptr |   This is the int**
      ------
      ... it points to an array of pointers below
      ... with one pointer for each of the first dimension
      ---------------------------------------
      |ptr0|ptr1|ptr2|ptr3|....|Y-2 |Y-1 |Y  |   Each of these is an int *
      ---------------------------------------
      ... And each of those pointers points to a contiguous 1D array of ints
      ... each the size of the second dimension
      ... that can be (and usually are) distributed at disparate positions in memory
      ... and in no particular order, even before the above block of ram.
      --------....-----.....-----...etc
      |ary4|      |ary2|     |ary0|
      --------....-----.....-----...etc

      There is simply no way to cast from one to the other.
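To make the difference concrete: an `int **` only works if the array of row pointers actually exists somewhere. The sketch below builds that pointer array on top of a flat block. It is an illustration under assumptions, not something from the thread: `make_row_pointers` is a made-up name, and the pointer table is allocated with malloc, so it lives outside Perl's memory management and the caller has to free it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Build the array of row pointers that an int ** needs, each pointing
   into the flat block. Returns NULL on allocation failure; the caller
   releases the result with free(). */
int **make_row_pointers(int *flat, size_t rows, size_t cols)
{
    int **table;
    size_t i;

    assert(flat != NULL);
    table = malloc(rows * sizeof *table);
    if (table == NULL) {
        return NULL;
    }
    for (i = 0; i < rows; i++) {
        table[i] = flat + i * cols;   /* start of row i inside the flat block */
    }
    return table;
}
```

After this, `table[i][j]` is valid because `table[i]` is a real pointer. Casting the flat block itself to `int **` instead makes C read the first ints as if they were addresses, which is where the segfault comes from.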


        I think BrowserUk is spot on! There is a problem with the C code and the interface.

        I am still troubled by the whole concept of the OP's approach. Instead of packing some things ("chars") into words for C to access on a byte per byte basis, run some regex or other Perl operation on those "things" and THEN run pack() if you need to! I see no need for 'C' here.

        Perl has a very efficient char per char replacement operator called "tr". And also, 'a' + 1 REALLY does mean "b"! Just like in 'C'. I think that if we get to the real requirement, we will find that Perl will do it just fine. The sub() appears trivial enough that I see no need to go through the hassle of interfacing 2 languages.