in reply to Passing a bytestring from Perl to Inline::C as a 2d array

Update: I looked at the requirement in the post again: "In order to create a compact table for recording scores fast,....". What I wrote below about C isn't wrong, but I am now of the opinion that you don't need C at all until you show otherwise. Please explain more about your app and why you think you need to interface with 'C'. What kind of benchmarks have you done? Do you need help with that part? Often the right algorithm and data structure matter more than the raw speed of the language.
end of update

I don't know the answer to the Perl<->C interface question. But your C code looks strange to me.

I have no idea what this "foo" and "bar" stuff is about. In C, practical 2-D arrays are built as array of pointer to "type" and that pointer array is NULL terminated. In this case, pointer to pointer to char. The "built-in" C 2D array is pretty much worthless in practical applications.

Anyway, what you are describing is, from the C point of view, exactly like "argv". I don't see how your C code even compiles! This is a Perl forum, but I would think a first step in interfacing Perl and C is being able to write working C code! Below, dump_strings() will dump "argv" or the "data" structure.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void dump_strings(char **byte_strings);

int main(int argc, char **argv)
{
    char *data[] = {"some stuff", "more stuff", "and third stuff", NULL};

    dump_strings(data);
    dump_strings(argv);   /* argv is NULL terminated too */
    exit(0);
}

/* Print each string of a NULL-terminated array of strings. */
void dump_strings(char **byte_strings)
{
    for ( ; *byte_strings; byte_strings++) {
        printf("%s\n", *byte_strings);
        /* a per char loop would go here */
    }
}

/* prints:
some stuff
more stuff
and third stuff
(followed by the argv entries, starting with the program name)
*/
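The "per char loop" mentioned in the comment could look something like the sketch below. The function name total_chars is made up for illustration and is not part of the original code; it just counts every byte it visits so there is something to check.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post): count the characters
   in every string of a NULL-terminated array, visiting each byte with a
   pointer rather than an index. */
size_t total_chars(char **byte_strings)
{
    size_t total = 0;

    for ( ; *byte_strings; byte_strings++) {
        const char *p;

        /* the per-char loop: stops at each string's '\0' terminator */
        for (p = *byte_strings; *p; p++)
            total++;
    }
    return total;
}
```

Both loops stop on a sentinel: the outer one on the NULL pointer, the inner one on the '\0' byte, so neither needs a stored length.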

Re^2: Passing a bytestring from Perl to Inline::C as a 2d array
by Porculus (Hermit) on Nov 12, 2009 at 00:13 UTC
    In C, practical 2-D arrays are built as array of pointer to "type" and that pointer array is NULL terminated. In this case, pointer to pointer to char. The "built-in" C 2D array is pretty much worthless in practical applications.

    Actually, NULL termination is almost invariably a terrible idea. It wastes space, because you should know how long the array is. If you don't know how long the array is, then you can only safely access it sequentially; if you need random access, this means you have to calculate the length anyway, by scanning the whole thing looking for the NULL, before you can access it at all. This is an unnecessary complication when you could just use that extra word of memory to store the length of the array in the first place.

    To be honest, storing as pointer-to-pointer is also often an unnecessary complication. It's very easy to mess up memory management when nesting pointers.
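    A minimal sketch of the "store the length" alternative, next to the scan that a NULL-terminated array needs. The names string_table, table_row and count_rows are invented for this illustration; they are not from the original question or from any library.

```c
#include <stddef.h>

/* Hypothetical type for illustration: a table of strings that carries
   its own row count, so no NULL sentinel is needed. */
typedef struct {
    size_t count;        /* number of valid entries in rows */
    const char **rows;   /* rows[0] .. rows[count - 1] */
} string_table;

/* Return row i, or NULL if i is out of range. No scan is required. */
const char *table_row(const string_table *t, size_t i)
{
    return (i < t->count) ? t->rows[i] : NULL;
}

/* For comparison: finding the length of a NULL-terminated array
   means walking it until the sentinel is found. */
size_t count_rows(const char *const *rows)
{
    size_t n = 0;

    while (rows[n] != NULL)
        n++;
    return n;
}
```

    With the count stored, a range check before random access is a single comparison; with the sentinel, the first random access has to pay for count_rows() or trust the caller.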

      Well, actually NULL termination is almost always the right idea with an array of pointers and is THE WAY to pass matrices around in 'C'. I will grant you some exceptions, but they are few, very few. I find your claim rather odd.

      Update: First claim: wastes space: Nonsense!

      Memory is usually allocated in increments of (8 or 16 pointers) x X, where X >= 1. On my machine (Win XP), a pointer is the same size as an int. Memory is allocated in hunks of 128 bytes, or 32 "words" of 4 bytes each. So with a random number of things in a list, storing the NULL pointer will require one more allocation unit only 1 time in 32. Storing a "count" will require at least that much and probably more, because now we have an "extra thing" in addition to the pointers to data. Now we are getting into an "object" in OO-ese, and depending upon how this is implemented, it could wind up taking a lot more storage to represent a 2D array!

      Second Claim: if you need random access, this means you have to calculate the length anyway, by scanning the whole thing looking for the NULL, before you can access it at all.


      Well of course not! If "you" made this structure, with the intention of using/modifying it, you will know how "big it is". I will show some code below that accesses a char** array. 'C' doesn't put any limits on how array indices are calculated. C assumes that you "know what you are doing", and you can screw up massively!

      Update: the code below does screw up. It works only because the data array was declared and assigned all at once. Like I said above: "you can screw up massively!" I did it also! In general each "row" will have no correlation with any other "row". See how this can fool me (and you)? This is exactly my point about using pointers to traverse a char** structure instead of indices. This doesn't mean that "random access" isn't possible... it is! Just like below. BUT you have to know how many things are on THAT ROW, i.e. the number of COLUMNS for that row.

      Many matrices have the same number of columns for each row, a char ** usually does not. The determination of number of rows is a trivial thing...just like in Perl! If the rows are "ragged", non-equal number of columns, then things get more complex...just like in Perl!

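      #include <stdio.h>

      /* Only [2][7] and [2][6] are in bounds. Every other index runs off the
         end (or the front) of its row, which is undefined behavior. The letters
         noted are what the author's build printed, because the three string
         literals happened to sit next to each other in memory. */
      int main(void)
      {
          char *data[] = {"some stuff", "ABCEDFGHIJ", "and third stuff", NULL};

          printf("%c\n", data[0][28]);  /* prints "i", in third */
          printf("%c\n", data[2][7]);   /* prints "r", in third */
          printf("%c\n", data[2][6]);   /* prints "i", in third */
          printf("%c\n", data[1][23]);  /* prints "u", in stuff */
          printf("%c\n", data[2][-9]);  /* prints "C", in ABC... */
          printf("%c\n", data[1][-5]);  /* prints "t", in stuff */
          return 0;
      }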
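      For contrast, a bounds-checked version of the same kind of random access might look like the sketch below. It takes the point above literally: the column is checked against the length of THAT ROW. The helper name char_at and the choice of -1 as the "out of range" value are assumptions made for this example only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fetch rows[row][col] from a NULL-terminated array
   of C strings, checking both indices first.
   Returns the character, or -1 if either index is out of range. */
int char_at(const char *const *rows, size_t row, size_t col)
{
    size_t i;

    /* Walk towards the requested row, stopping early at the NULL sentinel. */
    for (i = 0; i < row; i++) {
        if (rows[i] == NULL)
            return -1;
    }
    if (rows[row] == NULL)
        return -1;

    /* Rows are ragged, so the column limit is this row's own length. */
    if (col >= strlen(rows[row]))
        return -1;

    return (unsigned char)rows[row][col];
}
```

      Called on the same data array, char_at(data, 2, 7) gives 'r' as before, while char_at(data, 0, 28) gives -1 instead of reading into a neighbouring string. Negative indices cannot be expressed at all because the parameters are size_t.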