in reply to C types and SV's and unpack

Personally, I would keep just the "dumb memory" in C and use pack/unpack to fetch the data from it in Perl. But that may well be because I feel more confident with Perl than with C. Also see Tie::Array::PackedC for inspiration on how to make your buffer appear as an array of integers from the outside.

Re^2: C types and SV's
by blakew (Monk) on Jul 18, 2010 at 18:57 UTC
    Thanks Corion. If you don't mind could you show me an example of pack/unpacking a 36-byte char* buffer into an array of integers, and/or: How would I move the buffer from C to Perl? (I'm aware there's documentation, which I will read, but I do much better with examples.)

      First you make the C data buffer available to Perl as a PV, I think via newSVpv. Check the documentation on who owns the memory: if the buffer should be freed by your library rather than by Perl, see the documentation on how to tell Perl not to free your PV.

      Then, you can use unpack to get at the raw numbers. This will likely use the same semantics as your C compiler, which should in this case be "good enough". Beware that this might break if you transport data between machines with differing endianness or word size:

      my $bps = 32;
      my $fmt = 3;
      my %template = (
          "32,3" => 'f', # A single-precision (32-bit) float in native format.
          "32,1" => 'I',
          "32,2" => 'i',
          # ... add the other formats as you need
      );
      my $rawdata = get_raw_data_from_C_library();
      my $t = $template{ "$bps,$fmt" }
          or die "Couldn't find an unpack template for $bps BPS, format $fmt";
      # "*" repeats the template until the buffer is exhausted
      my @samples = unpack "$t*", $rawdata;
        Corion, you are the best. I think I've got it:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Inline 'C';
        Inline->init;

        my $buffer_size = 36; # (6x6)
        print "Buffer size: $buffer_size\n";

        my %template = (
            "32,3" => 'f',
            "32,1" => 'I',
            "32,2" => 'i',
            "8,1"  => 'C', # unsigned char
            "8,2"  => 'c', # signed char
            # ... add the other formats as you need
        );

        my $data;
        print "char* data:\n";
        $data = get_char_buffer();
        printf "Raw data %s\n", $data;
        print_buf( $data, 8, 1 ); # unsigned ints

        print "\nfloat* data:\n";
        print_buf( get_float_buffer(), 32, 3 ); # floating point

        sub print_buf {
            my ( $rawdata, $bps, $fmt ) = @_;
            print "Ref count: " . get_ref_count( $rawdata ) . "\n";
            my $t = $template{ "$bps,$fmt" }
                or die "Couldn't find an unpack template for $bps BPS, format $fmt";
            print "Template: $t\n";
            my @samples = unpack( "${t}${buffer_size}", $rawdata );
            print "Samples:\n@samples\n";
        }

        1;

        __DATA__
        __C__

        #define BUF_SIZE 36

        SV* get_char_buffer() {
            void* buffer;
            SV* sv_buf;
            int i;
            buffer = malloc(sizeof(char) * BUF_SIZE);
            for (i = 65; i < 65 + BUF_SIZE; i++) { // printable chars
                ((char*)buffer)[i-65] = (char)i;
            }
            /* newSVpv copies the bytes, so the C buffer can be freed afterwards */
            sv_buf = newSVpv(buffer, BUF_SIZE);
            free(buffer);
            return sv_buf;
        }

        SV* get_float_buffer() {
            void* buffer;
            SV* sv_buf;
            int i;
            buffer = malloc(sizeof(float) * BUF_SIZE);
            for (i = 1; i <= BUF_SIZE; i++) {
                ((float*)buffer)[i-1] = (float)(i + 0.5);
            }
            /* the length is in bytes, not bits: sizeof(float) per sample */
            sv_buf = newSVpv(buffer, sizeof(float) * BUF_SIZE);
            free(buffer);
            return sv_buf;
        }

        int get_ref_count(SV* buf) {
            return (int)(SvREFCNT(buf));
        }
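        To check the unpack templates without compiling any C, the two buffers can be simulated in pure Perl. This is only a sketch: pack() stands in for the C functions, and the variable names are made up for illustration. It assumes nothing beyond core Perl.

```perl
use strict;
use warnings;

# Stand-ins for the C buffers: pack() builds the same kind of raw bytes.
my $count     = 36;
my $char_buf  = pack 'C*', 65 .. 65 + $count - 1;          # 36 unsigned bytes
my $float_buf = pack 'f*', map { $_ + 0.5 } 1 .. $count;   # 36 native floats

# Unpack with an explicit repeat count, as print_buf() does.
my @bytes  = unpack "C$count", $char_buf;
my @floats = unpack "f$count", $float_buf;

printf "%d bytes, first %d, last %d\n",
    scalar @bytes, $bytes[0], $bytes[-1];
printf "%d floats, first %.1f, last %.1f\n",
    scalar @floats, $floats[0], $floats[-1];
```

        Because the same machine packs and unpacks, the native 'f' template round-trips exactly here; that says nothing about data produced on a different architecture.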
Re^2: C types and SV's
by DrHyde (Prior) on Jul 19, 2010 at 09:47 UTC

    The problem here is that, in the general case (which the OP may not care about, I'll admit) using plain ol' pack/unpack isn't very portable, as you have to tell it how the structure is laid out in memory. And C makes very few guarantees about that. Merely by looking at the data you can't tell, for example, whether you have a 16-, 32- or 64-bit value; you can't tell whether a word is signed or unsigned; you can't tell whether a structure has been padded with empty space so that its members are on word boundaries; and so on.
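    A small sketch of that ambiguity, using only core pack/unpack. The byte strings are invented for illustration, and the three padding bytes are an assumption about what a typical compiler would insert between a char and a 32-bit int; the explicit "<" modifier needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Four bytes with every bit set: the buffer alone does not say what they mean.
my $raw = "\xFF\xFF\xFF\xFF";

my @as_u8  = unpack 'C4', $raw;   # four unsigned 8-bit values
my @as_s8  = unpack 'c4', $raw;   # four signed 8-bit values
my @as_u16 = unpack 'S2', $raw;   # two unsigned 16-bit values
my ($u32)  = unpack 'L',  $raw;   # one unsigned 32-bit value
my ($s32)  = unpack 'l',  $raw;   # one signed 32-bit value

print "C4: @as_u8\n";
print "c4: @as_s8\n";
print "S2: @as_u16\n";
print "L:  $u32\n";
print "l:  $s32\n";

# Assumed layout of struct { char c; int32_t i; } with 3 bytes of padding,
# written little-endian so the result does not depend on this machine.
my $padded = pack 'c x3 l<', 1, 2;

my @wrong = unpack 'c l<',    $padded;   # ignores the padding
my @right = unpack 'c x3 l<', $padded;   # skips the padding

print "without padding: @wrong\n";
print "with padding:    @right\n";
```

    Every one of those templates unpacks without an error or a warning, which is exactly why a wrong guess about the layout goes unnoticed.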

    I'd take a look at Convert::Binary::C, and in particular the ccconfig script that comes with it.

      I'm aware of that, so my translation was mostly what the C code said. The C code is not portable for exactly the reason you cite: the word size may vary (again, with the advent of 64-bit architectures). If I were to really implement a TIFF reader, I would look at the documentation again, or at existing implementations, to make sure that the code reads the right number of bytes and treats them correctly. I'm sure that TIFF defines its endianness, and then one could switch to the Nn or Vv unpack templates instead of the architecture-dependent Ii templates.

        TIFF can be big- or little-endian. There's a marker in the file to tell you which one it is.
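        As a rough sketch of how that marker could drive the choice of template: a TIFF file starts with "II" (little-endian) or "MM" (big-endian), followed by the 16-bit magic number 42 and the 32-bit offset of the first image file directory. The function name tiff_templates is hypothetical, and the headers are built with pack here rather than read from a real file.

```perl
use strict;
use warnings;

# Return the unpack templates for 16- and 32-bit unsigned values that match
# the byte-order marker at the start of a TIFF header.
sub tiff_templates {
    my ($header) = @_;
    my $order = substr( $header, 0, 2 );
    return ( 'v', 'V' ) if $order eq 'II';   # little-endian
    return ( 'n', 'N' ) if $order eq 'MM';   # big-endian
    die "Not a TIFF header: unexpected byte-order marker\n";
}

# Two made-up 8-byte headers: marker, magic number 42, first IFD offset 8.
my @headers = (
    pack( 'a2 v V', 'II', 42, 8 ),
    pack( 'a2 n N', 'MM', 42, 8 ),
);

for my $header (@headers) {
    my ( $t16, $t32 ) = tiff_templates($header);

    # Skip the two marker bytes, then read both fields in the file's order.
    my ( $magic, $ifd_offset ) = unpack "x2 $t16 $t32", $header;
    die "Bad TIFF magic number $magic\n" unless $magic == 42;

    printf "%s: magic %d, first IFD at offset %d\n",
        substr( $header, 0, 2 ), $magic, $ifd_offset;
}
```

        Both headers decode to the same values, whatever the byte order of the machine running the script, because n/N and v/V fix the byte order in the template itself.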