in reply to Re^5: Perl XS binding to a struct with an array of chars* in thread Perl XS binding to a struct with an array of chars*
Once I start, I can't stop ;-)
I still don't know how to deal with:
struct _Edje_Message_String_Set
{
int count;
char * str[];
};
Is it possible to handle that ?
However, for:
struct _Edje_Message_String_Set
{
int count;
char ** str;
};
I have:
# struct_char.pl #
use strict;
use warnings;
use Inline C => Config =>
BUILD_NOISY => 1,
CLEAN_AFTER_BUILD => 0,
USING => 'ParseRegExp',
;
use Inline C => <<'EOC';
struct _Edje_Message_String_Set
{
int count;
char ** str;
};
typedef struct _Edje_Message_String_Set EdjeMessageStringSet;
void struct_size(void) {
printf("Size of _Edje_Message_String_Set struct: %d\n",
(int)sizeof(EdjeMessageStringSet) ); /* sizeof yields size_t, so cast for %d */
}
EdjeMessageStringSet * _new(AV * val_arr) {
EdjeMessageStringSet *message;
int i;
SV ** elem;
Newx(message, 1, EdjeMessageStringSet);
if(message == NULL)
croak("Failed to allocate memory in _new function");
message->count = av_len(val_arr) + 1;
Newx(message->str, message->count, char*);
for(i = 0; i < message->count; i++) {
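elem = av_fetch(val_arr, i, 0);
if(elem == NULL)
croak("Missing array element in _new function");
message->str[i] = SvPV_nolen(*elem); /* points into Perl's own buffer, no copy is made */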
}
return message;
}
void _iterate(EdjeMessageStringSet * strs) {
int i;
for(i = 0; i < strs->count; i++) {
printf("%s\n", strs->str[i]);
}
}
void DESTROY(EdjeMessageStringSet * x) {
Safefree(x->str);
printf("Safefreed EdjeMessageStringSet object->str\n");
Safefree(x);
printf("Safefreed EdjeMessageStringSet object\n");
}
void foo(AV * arref) {
EdjeMessageStringSet *m;
m = _new(arref);
_iterate(m);
DESTROY(m);
}
EOC
struct_size();
my @in = ("hello foo", "hello bar","hello world", "goodbye", '1', '2', '3');
# The XSub foo() will create a new EdjeMessageStringSet object
# using _new(), then pass that object to _iterate() which
# prints out all of the strings contained in the object.
# Finally, foo() calls DESTROY() which frees the memory that
# was assigned to create the EdjeMessageStringSet object.
foo(\@in);
Which (after compiling) outputs:
Size of _Edje_Message_String_Set struct: 16
hello foo
hello bar
hello world
goodbye
1
2
3
Safefreed EdjeMessageStringSet object->str
Safefreed EdjeMessageStringSet object
Cheers, Rob
Re^7: Perl XS binding to a struct with an array of chars*
by Marshall (Canon) on Nov 26, 2022 at 02:32 UTC
struct _Edje_Message_String_Set
{
int count;
char * str[];
};
Is it possible to handle that ?
Yes, it sure is!
I actually decided that this is the best way to implement the idea of the array of pointers within the struct itself. My reasoning is that the [] gives a big clue that there is no storage in the struct at all for the array of pointers to strings.
In my implementation below, I allocated one more slot for a NULL pointer. That is so that the more traditional C pointer-iteration style can be used as an alternative to a counter based upon the "count".
Also note that the Perl print statements come before the C print statements! That of course has to be due to some buffering weirdness, but I didn't figure out how to defeat that behaviour. Also, for some reason, the API function av_count() wasn't available on my version, so a simple workaround calculation was used.
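On the print ordering: Perl's STDOUT and C's stdout keep separate buffers, so the C text can sit queued until the process exits. A minimal sketch of one likely remedy, assuming it is acceptable to flush on the C side after printing (the helper name say_now is made up for illustration):

```c
#include <stdio.h>

/* Hypothetical helper: print one line, then flush C's stdout buffer
   right away so the text is not held back until the program exits. */
int say_now(const char *msg)
{
    int n = printf("%s\n", msg);
    if (n < 0)
        return -1;                 /* printf reported an error */
    if (fflush(stdout) != 0)
        return -1;                 /* flush failed */
    return n;                      /* characters written, newline included */
}
```

Calling fflush(stdout) once at the end of foo(), or setvbuf(stdout, NULL, _IONBF, 0) once at startup, should have a similar effect. I have not checked this on every Perl build.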
#MessageStorageInsideStruct 11/25/2022 #
#https://www.perlmonks.org/?node_id=11148268 #
# #
use strict;
use warnings;
use Inline C => Config =>
BUILD_NOISY => 1,
CLEAN_AFTER_BUILD => 1,
USING => 'ParseRegExp',
;
use Inline "C";
struct_size();
my @in = ("hello foo", "hello bar","hello world", "goodbye", '1', '2', '3');
foo(\@in);
print "back inside Perl program!!!!\n";
print "printout is not in time order you expected!\n\n";
=EXAMPLE RUN
back inside Perl program!!!!
printout is not in time order you expected!
Size of _Edje_Message_String_Set structure: 8 bytes
Address of the Message Struct: 000000000308FAB8
Address of the Pointer Array: 000000000308FAC0
setting element 0
string: hello foo's pointer is at Address 000000000308FAC0
setting element 1
string: hello bar's pointer is at Address 000000000308FAC8
setting element 2
string: hello world's pointer is at Address 000000000308FAD0
setting element 3
string: goodbye's pointer is at Address 000000000308FAD8
setting element 4
string: 1's pointer is at Address 000000000308FAE0
setting element 5
string: 2's pointer is at Address 000000000308FAE8
setting element 6
string: 3's pointer is at Address 000000000308FAF0
Dumping an EdjeMessageStringSet
Count = 7
1) hello foo
2) hello bar
3) hello world
4) goodbye
5) 1
6) 2
7) 3
Destroying an EdjeMessageStringSet
=cut
__END__
__C__
/*********** Start of C Code ********/
struct _Edje_Message_String_Set
{
int count; //On 64 bit machine, this is 8 bytes
char *str[]; //str has no "size" and is not counted in sizeof(_Edje_Message_String_Set)
};
typedef struct _Edje_Message_String_Set EdjeMessageStringSet;
void struct_size(void) {
printf("Size of _Edje_Message_String_Set structure: %d bytes\n",
(int)sizeof(EdjeMessageStringSet) ); /* sizeof yields size_t, so cast for %d */
}
/************/
EdjeMessageStringSet * _new(AV* val_arr)
{
int count = av_len(val_arr) + 1; // newer av_count() not avail this version
// sizeof(Edje_Message_String_Set) only has the integer,count of 8 bytes
// space for the array of pointers must be allocated by safemalloc
// remember to add one more slot for a NULL pointer
EdjeMessageStringSet* message = (EdjeMessageStringSet*) safemalloc
( sizeof(EdjeMessageStringSet) + (count+1)*sizeof(char*) );
printf ("Address of the Message Struct: %p\n",message);
if(message == NULL)
croak("Failed to allocate memory for message in _new function");
message->count = count;
char** p = &(message->str[0]);
printf ("Address of the Pointer Array: %p\n", p);
int i;
for(i= 0; i < message->count; i++) {
printf ("setting element %d\n",i);
SV** elem = av_fetch(val_arr, i, 0);
if (elem==NULL) croak ("bad SV elem value in _new function");
char* string = SvPVutf8_nolen(*elem);
printf ("string: %s's pointer is at Address %p\n",string,p);
*p++ = savepv(string);
}
*p = NULL; //Can use either count or NULL pointer as a loop variable
return message;
}
/******************/
void _iterate (EdjeMessageStringSet* m)
{
printf ("Dumping an EdjeMessageStringSet\n");
printf ("Count = %d\n",m->count);
int i = 1;
char** p = &(m->str[0]);
while (*p) {printf ("%d) %s\n",i++,*p++);}
}
void DESTROY(EdjeMessageStringSet* m)
{
printf ("Destroying an EdjeMessageStringSet\n");
char** p = &(m->str[0]);
while(*p){Safefree(*p++);} //zap each cloned string
Safefree(m); //zap main structure
}
/************/
void foo(AV * arref)
{
EdjeMessageStringSet* m = _new(arref);
_iterate(m);
DESTROY(m);
}
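The same layout can be tried outside Perl. Below is a minimal sketch in plain C, assuming malloc/free in place of safemalloc/Safefree and a non-negative count; StringSet and the stringset_* functions are hypothetical stand-ins for illustration, not part of the Edje or Perl APIs:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int   count;
    char *str[];   /* flexible array member: no storage inside the struct */
} StringSet;       /* hypothetical stand-in for EdjeMessageStringSet */

/* Free every copied string (walking up to the NULL sentinel), then the block. */
void stringset_free(StringSet *s)
{
    if (s == NULL)
        return;
    for (char **p = s->str; *p != NULL; p++)
        free(*p);
    free(s);
}

/* One allocation holds the header plus (count + 1) pointer slots. */
StringSet *stringset_new(const char *const *src, int count)
{
    StringSet *s = malloc(sizeof *s + ((size_t)count + 1) * sizeof(char *));
    if (s == NULL)
        return NULL;
    s->count = count;
    for (int i = 0; i < count; i++) {
        size_t len = strlen(src[i]) + 1;
        s->str[i] = malloc(len);
        if (s->str[i] == NULL) {   /* slot i is NULL, so it serves as the sentinel */
            stringset_free(s);
            return NULL;
        }
        memcpy(s->str[i], src[i], len);
    }
    s->str[count] = NULL;          /* sentinel for pointer-style iteration */
    return s;
}

/* Print the entries using only the sentinel; returns how many were seen. */
int stringset_walk(const StringSet *s)
{
    int n = 0;
    for (char *const *p = s->str; *p != NULL; p++)
        printf("%d) %s\n", ++n, *p);
    return n;
}
```

The point of the design is that the pointer array lives directly behind count in one block, so a single free releases both the header and the slots once the copied strings are gone.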
Nice work !!
I'll draw attention to an oddity I've just noticed, of perhaps little significance.
Regarding the struct definition and typedef:
struct _Edje_Message_String_Set
{
int count; //On 64 bit machine, this is 8 bytes
char *str[]; //str has no "size" and is not counted in sizeof(_Edje_Message_String_Set)
};
typedef struct _Edje_Message_String_Set EdjeMessageStringSet;
The 'int' type is always 4 bytes on Windows, irrespective of architecture. And I think it's generally the same case on Linux.
IIRC, on Linux, it's usually the size of the 'long int' that varies with architecture - but 'int' usually stays at 4 bytes.
On 64 bit Windows, I'm finding that if char * str[]; is removed from the struct, then struct size is 4 bytes.
If char * str[]; is included, then the struct size is 8 bytes.
So it seems that "str" is increasing the size of the struct by 4 bytes. But that doesn't seem right to me.
Here's the demo I used:
use strict;
use warnings;
use Inline C => Config =>
BUILD_NOISY => 1,
USING => 'ParseRegExp',
;
use Inline C =><<'EOC';
typedef struct _foo
{
int count;
} foo;
typedef struct _bar
{
int count;
char *str[];
} bar;
void sizes(void) {
printf( "FOO: %d %d\n", (int)sizeof(struct _foo),
(int)sizeof( foo) );
printf( "BAR: %d %d\n", (int)sizeof(struct _bar),
(int)sizeof( bar) );
printf( "INTSIZE: %d\n", (int)sizeof(int) );
}
EOC
sizes();
__END__
On 64 bit windows, outputs:
FOO: 4 4
BAR: 8 8
INTSIZE: 4
Maybe a bug in xsubpp ? Nope - it's just something that 64-bit gcc does on both Windows and Ubuntu. If it's a bug, then it's a gcc bug.
On 32-bit windows, it seems that char *str[]; does indeed make zero contribution to the size of the struct, and the same script outputs:
FOO: 4 4
BAR: 4 4
INTSIZE: 4
Cheers, Rob
I was surprised to learn that size of int is 32 bits even on a 64 bit compiler. But that is true. To get 64 bit int, even long long is required on some compilers.
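If an exact width matters, the C99 header <stdint.h> avoids the guessing. A small sketch (the bits_* function names are mine, purely for illustration) that reports the widths the current compiler actually uses:

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Width in bits of a few integer types on the current compiler. */
int bits_int(void)       { return (int)(sizeof(int) * CHAR_BIT); }
int bits_long(void)      { return (int)(sizeof(long) * CHAR_BIT); }
int bits_long_long(void) { return (int)(sizeof(long long) * CHAR_BIT); }
int bits_int64(void)     { return (int)(sizeof(int64_t) * CHAR_BIT); }

/* Print all four so different platforms can be compared side by side. */
void print_int_widths(void)
{
    printf("int: %d, long: %d, long long: %d, int64_t: %d bits\n",
           bits_int(), bits_long(), bits_long_long(), bits_int64());
}
```

int64_t is exactly 64 bits wherever it exists, while int and long are only given minimum widths by the C standard.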
What I found out is that the default for the 64 bit gcc compiler is to pad structures so that their members stay naturally aligned in memory. This is what is causing the seemingly weird results with your sizeof() test code. sizeof() can return more than the obvious number of bytes due to padding. See Data Alignment in Structs for some more info. There is a way in gcc to override this behavior if perhaps you need to match some weird binary structure exactly verbatim. Just like malloc(), struct address assignment "likes" 8-byte (64 bit) alignment.
typedef struct _foo
{
int count; //with just int, sizeof() is 4 bytes
char anything; //forces sizeof() increase from 4 to 8 bytes!
} foo;
Evidently, even putting something in the struct that has no storage, like char* pointer[], forces alignment bytes to be added. The mysterious extra 4 bytes aren't anything, they are just junk padding bytes. The pointer(s) when added will be 8 bytes and it is highly desirable for these to be fully contained on a single memory row. If they are on 32 bit boundaries, the hardware can still read them, but at a performance penalty.
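To see where the extra bytes come from, one can ask the compiler directly. A minimal sketch in plain C, reusing the foo/bar shapes from the demo earlier in the thread; the bar_* helper names are made up, and _Alignof needs a C11 compiler:

```c
#include <stddef.h>
#include <stdio.h>

typedef struct { int count; } foo;
typedef struct { int count; char *str[]; } bar;

/* Alignment the compiler requires for a whole bar object. */
size_t bar_alignment(void)  { return _Alignof(bar); }

/* Byte offset at which the pointer array would begin. */
size_t bar_str_offset(void) { return offsetof(bar, str); }

/* Padding bytes inserted between count and the pointer array. */
size_t bar_padding(void)    { return offsetof(bar, str) - sizeof(int); }

void print_bar_layout(void)
{
    printf("sizeof(foo)=%u sizeof(bar)=%u align=%u offset(str)=%u padding=%u\n",
           (unsigned)sizeof(foo), (unsigned)sizeof(bar),
           (unsigned)bar_alignment(), (unsigned)bar_str_offset(),
           (unsigned)bar_padding());
}
```

On a typical 64 bit gcc this should report an alignment of 8 and 4 bytes of padding. On 32 bit, where pointers are 4 bytes, the padding should vanish, which would match the two outputs reported above.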
So my code as written performs correctly, however the explanation of why the struct is 8 bytes is not correct. 4 bytes are for the integer (not 8 as I wrongly assumed) but then an additional 4 bytes are added as padding. 4+4=8. So the entire array of pointers is 64 bit aligned (8 byte boundary).
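For completeness, the gcc override mentioned above is, as far as I know, the packed type attribute. A hedged sketch, GCC/Clang specific and not portable C; the type and function names are invented for the comparison:

```c
#include <stddef.h>

/* Natural layout: the compiler may pad after count. */
typedef struct { int count; char *str[]; } natural_set;

/* GCC/Clang extension: drop alignment padding for this type. */
typedef struct __attribute__((packed)) { int count; char *str[]; } packed_set;

size_t natural_size(void) { return sizeof(natural_set); }
size_t packed_size(void)  { return sizeof(packed_set); }
```

With gcc on a 64 bit build I would expect packed_size() to be 4 and natural_size() to be 8. The cost is that the pointers in str may then sit on unaligned addresses, so this only seems worth it when an external binary layout demands it.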
Re^7: Perl XS binding to a struct with an array of chars*
by GrandFather (Saint) on Nov 24, 2022 at 21:06 UTC
I thought I'd have a crack at demoing how char *[] might hang together. So first I downloaded your code, then realised I didn't have Inline or Inline::C installed, so I set about doing that using cpanm. The Inline install was quick and trouble free. The Inline::C install however has reached:
...
Building and testing Tie-IxHash-1.23 ... OK
Successfully installed Tie-IxHash-1.23
Building and testing Pegex-0.75 ... OK
Successfully installed Pegex-0.75
--> Working on Win32::Mutex
Fetching http://www.cpan.org/authors/id/C/CJ/CJM/Win32-IPC-1.11.tar.gz ... OK
Configuring Win32-IPC-1.11 ... OK
Building and testing Win32-IPC-1.11 ... OK
Successfully installed Win32-IPC-1.11
Building and testing Inline-C-0.82 ...
and has been sitting there for the last half hour with no apparent progress. This is on Windows 10 and Strawberry Perl 5.32.1:
Summary of my perl5 (revision 5 version 32 subversion 1) configuration:
Platform:
osname=MSWin32
osvers=10.0.19042.746
archname=MSWin32-x86-multi-thread-64int
uname='Win32 strawberry-perl 5.32.1.1 #1 Sun Jan 24 12:17:47 2021 i386'
Ideas?
Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Ideas?
It can take a few minutes to run through the test suite, but 30 minutes is a bit extreme.
I see you've got a 32-bit build of Strawberry Perl. That's a little uncommon these days, but should not be an issue.
I've just installed Inline-C-0.82 on a 32-bit SP-5.32.1.1 using cpanm, and it all went fine.
Try turning on verbosity (cpanm -vi Inline::C). That should at least show us where the hang is happening.
Cheers, Rob
Hi Rob, I have had endless trouble installing "interesting" modules on my work machine. No idea why, but I've found it impossible to install Tk against any version of Strawberry Perl I care to try, and a multitude of other modules (which I don't remember) have given grief. I installed Win32::IPC and reran the Inline::C install. It at least completes now, but fails most of its tests. So I installed it with force. Your sample code then at least starts running, but fails with:
"C:\STRAWB~2\perl\bin\perl.exe" -MExtUtils::Command -e mv -- noname_pl_cf1b.xsc noname_pl_cf1b.c
gcc -c -iquote"D:/Scratch~~/PerlScratch" -DWIN32 -D__USE_MINGW_ANSI_STDIO -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -fwrapv -fno-strict-aliasing -mms-bitfields -s -O2 -DVERSION=\"0.00\" -DXS_VERSION=\"0.00\" "-IC:\STRAWB~2\perl\lib\CORE" noname_pl_cf1b.c
cc1.exe: error: unrecognized command line option "-iquoteD:/Scratch~~/PerlScratch"
gmake: *** [Makefile:333: noname_pl_cf1b.o] Error 1
A problem was encountered while attempting to compile and install your Inline
C code. The command that failed was:
"gmake" with error code 2
...
I'm not going to pursue this at work, but I may pick it up again at home.
Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
The main CPAN Inline::C page lists ... as dependencies
There's a module missing from that list. Win32::Mutex, which is part of the Win32::IPC distro, is also a dependency (on Windows only, of course).
This bug in Inline::C is not normally an issue on Strawberry Perl because that dependency usually gets picked up anyway.
But you can't be certain if/when/how it will manifest itself and, if it bites, then you just have to cpanm -i Win32::IPC.
Cheers, Rob
Re^7: Perl XS binding to a struct with an array of chars*
by Marshall (Canon) on Nov 24, 2022 at 22:24 UTC
I have Inline::C working on my system now. Before proceeding further, I decided to make a little test to see what malloc() is doing. Often more memory is allocated than requested and I wanted to see if I could get an idea of "how much more?". Sometimes there is a lucky accident where the code works although it is not guaranteed to work.
I have 2 programs.
1) 32 bit gcc independent of Perl
2) 64 bit gcc that is part of my 64 bit Perl 5.24 installation
Code for both is shown below.
For 32 bit version, I am not surprised to see 64 bit instead of 32 bit alignment (lower 3 address bits always zero). 64 bit alignment also appears to be the case for the 64 bit gcc code as well. Anyway the alignment drives the absolute minimum memory allocation unit. In both cases, 8 bytes (64 bit alignment). The question is then whether there is additional space due to the quanta that malloc() uses to track allocation?
The 32 bit standalone version does the exact same thing every run. I don't know why there is the difference between 2nd and 1st allocation, but after that we see the pattern of 16 bytes given for a single byte requested.
The 64 bit inline C does something different every run! For all I know this could be some kind of Perl security feature?
Anyway, I thought the results relevant to our discussion about allocation of this weird type.
#include <stdio.h>
#include <stdlib.h>
//testing malloc() this is 32 bit gcc - separate from Perl
void testMalloc(void)
{
char* x = (char *) malloc(1);
printf (" Byte Starting Memory Addr is %p\n",x);
char* y = (char *) malloc(1);
printf ("Next Byte Starting Memory Addr is %p\n",y);
printf ("difference in bytes between byte2 and byte1 = %d\n",(int)(y-x));
char* z = (char *) malloc(1);
printf ("Next Byte Starting Memory Addr is %p\n",z);
printf ("difference in bytes between byte3 and byte2 = %d\n",(int)(z-y));
char* alpha = (char *) malloc(1);
printf ("Next Byte Starting Memory Addr is %p\n",alpha);
printf ("difference in bytes between byte4 and byte3 = %d\n",(int)(alpha-z));
free(x);
free(y);
free(z);
free(alpha);
return;
}
int main(int argc, char *argv[])
{
testMalloc();
exit(0);
}
/*
Byte Starting Memory Addr is 00C92FD8
Next Byte Starting Memory Addr is 00C90CC8
difference in bytes between byte2 and byte1 = -8976
Next Byte Starting Memory Addr is 00C90CD8
difference in bytes between byte3 and byte2 = 16
Next Byte Starting Memory Addr is 00C90CE8
difference in bytes between byte4 and byte3 = 16
*/
/*
Byte Starting Memory Addr is 00B22FD8
Next Byte Starting Memory Addr is 00B20CC8
difference in bytes between byte2 and byte1 = -8976
Next Byte Starting Memory Addr is 00B20CD8
difference in bytes between byte3 and byte2 = 16
Next Byte Starting Memory Addr is 00B20CE8
difference in bytes between byte4 and byte3 = 16
*/
#################################################
# testing malloc 64 bit Perl
use strict;
use warnings;
use Inline C => Config =>
BUILD_NOISY => 1,
CLEAN_AFTER_BUILD => 0,
USING => 'ParseRegExp',
;
use Inline "C";
testMalloc();
=OUTPUT:
Byte Starting Memory Addr is 000000000308CE68
Next Byte Starting Memory Addr is 000000000308D258
difference in bytes between byte2 and byte1 = 1008
Next Byte Starting Memory Addr is 000000000308D018
difference in bytes between byte3 and byte2 = -576
Next Byte Starting Memory Addr is 000000000308D0A8
difference in bytes between byte4 and byte3 = -576
=cut
=AnotherRun:
Byte Starting Memory Addr is 00000000031157D8
Next Byte Starting Memory Addr is 0000000003115B98
difference in bytes between byte2 and byte1 = 960
Next Byte Starting Memory Addr is 0000000003115E98
difference in bytes between byte3 and byte2 = 768
Next Byte Starting Memory Addr is 0000000003115F88
difference in bytes between byte4 and byte3 = 768
=cut
=yet Another Run
Byte Starting Memory Addr is 0000000002EA6CF8
Next Byte Starting Memory Addr is 0000000002EA6EA8
difference in bytes between byte2 and byte1 = 432
Next Byte Starting Memory Addr is 0000000002EA7AA8
difference in bytes between byte3 and byte2 = 3072
Next Byte Starting Memory Addr is 0000000002EA7C58
difference in bytes between byte4 and byte3 = 3072
=cut
=one more time
Byte Starting Memory Addr is 000000000313CE38
Next Byte Starting Memory Addr is 000000000313C5F8
difference in bytes between byte2 and byte1 = -2112
Next Byte Starting Memory Addr is 000000000313C868
difference in bytes between byte3 and byte2 = 624
Next Byte Starting Memory Addr is 000000000313CA48
difference in bytes between byte4 and byte3 = 624
=cut
__END__
__C__
void testMalloc(void)
{
char* x = (char *) malloc(1);
printf (" Byte Starting Memory Addr is %p\n",x);
char* y = (char *) malloc(1);
printf ("Next Byte Starting Memory Addr is %p\n",y);
printf ("difference in bytes between byte2 and byte1 = %d\n",(int)(y-x));
char* z = (char *) malloc(1);
printf ("Next Byte Starting Memory Addr is %p\n",z);
printf ("difference in bytes between byte3 and byte2 = %d\n",(int)(z-y));
char* alpha = (char *) malloc(1);
printf ("Next Byte Starting Memory Addr is %p\n",alpha);
printf ("difference in bytes between byte4 and byte3 = %d\n",(int)(alpha-z)); // was z-y, which is what the runs above show
free(x);
free(y);
free(z);
free(alpha);
return;
}
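The spacing experiment can be made a little more robust. A sketch under my own assumptions: the addresses are compared as integers (subtracting pointers into unrelated allocations is not well defined in C), and all blocks stay allocated until the end so the allocator cannot reuse one. The function names are invented for illustration:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Signed distance in bytes from address a to address b, computed on integers. */
long long addr_gap(const void *a, const void *b)
{
    return (long long)((intptr_t)b - (intptr_t)a);
}

/* Returns 1 if a fresh n-byte allocation starts on a multiple of align. */
int alloc_is_aligned(size_t n, size_t align)
{
    void *p = malloc(n);
    if (p == NULL)
        return 0;
    int ok = ((uintptr_t)p % align) == 0;
    free(p);
    return ok;
}

/* Allocate four single bytes, print the gap between neighbours, free them all. */
void show_gaps(void)
{
    void *p[4] = {0};
    for (int i = 0; i < 4; i++)
        p[i] = malloc(1);
    for (int i = 1; i < 4; i++)
        if (p[i] != NULL && p[i - 1] != NULL)
            printf("gap %d: %lld bytes\n", i, addr_gap(p[i - 1], p[i]));
    for (int i = 0; i < 4; i++)
        free(p[i]);               /* free(NULL) is a harmless no-op */
}
```

The gaps themselves are an allocator detail and can differ from run to run, so only the alignment check says anything dependable.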
I decided to make a little test to see what malloc() is doing
I pretty much always use "Newx" or "Newxz" instead of "malloc" because I read somewhere that they are the recommended XS way of allocating memory.
However, I've always found the 2 alternatives to be interchangeable.
Maybe there are some systems and/or perl configurations where they cannot be used interchangeably but, to my knowledge, I've not encountered such a case.
IME, doing Newx(x, 42, datatype) is effectively the same as doing x=malloc(42 * sizeof(datatype)).
The important difference is that the former allocation must be released by "Safefree", whereas the latter must be released by "free".
Therein lies the sum of my knowledge of memory allocation ;-)
Cheers, Rob