TheDauthi has asked for the wisdom of the Perl Monks concerning the following question:

I recently received a poorly documented, poorly written DLL that I'm attempting to interface with Perl (ActiveState, Win32, 5.8.7). Obviously there are problems, the first being that I don't understand XS very well yet... but I'm working on that. The gist of my question: is there a good way to avoid an extra copy, given that there's a LOT of data? If there isn't, I'll just do it the straightforward way.

Basically, one of the C routines will be called in a loop and will easily return several thousand different 83-byte char buffers (i.e., I pass in an 83-byte buffer and it pushes data into it... several thousand times).

If I understand newSVpv correctly, that copies the message buffer into the new SV, which is an extra copy that I shouldn't need. What I'd like to do is something like:

    AV *
    GetTable()
        PREINIT:
            AV *array;
            int max_rows, max_cols, row, col;
            char *message;
            SV *message_sv;
        CODE:
            max_rows = 10;
            max_cols = 5;
            array = (AV*) sv_2mortal((SV*) newAV());
            for (row = 0; row < max_rows; row++) {
                for (col = 0; col < max_cols; col++) {
                    message_sv = newSV(83);
                    message = SvPV_nolen(message_sv);
                    ExpensiveGetString(row, col, message, 83);
                    av_push(array, message_sv);
                }
            }
            RETVAL = array;
        OUTPUT:
            RETVAL

Where ExpensiveGetString is approximately equal to:

    strcpy(message, "12345678901234567890123456789012345678901234567890123"
                    "456789012345678901234567890123456789012345678901234567890123456789012"
                    "\0");

which is exactly what I'm using to test it. I get an array of undefs back, which tells me that I'm missing something.

Why doesn't this work? Is there a better way to avoid copying the data twice (some of the datasets are in the 30k rows and 70 columns range)?

Re: XS efficient copy
by Anonymous Monk on Apr 06, 2006 at 19:54 UTC
    Try this instead:

        ExpensiveGetString(row, col, message, 83);
        av_push(array, newSVpvn(message, 83));

      That's a copy, which the OP doesn't want.

      I think the problem in the original code is the use of SvPV_nolen instead of SvPVX. If XS is doing the right thing, it's returning a (new) buffer to an empty string, because the SV is still undefined and while its string has been allocated, it hasn't been initialized. This is the usual formula for creating a buffer, then passing the buffer to a C routine:

          message_sv = newSV(83);
          SvPOK_on(message_sv);
          SvCUR_set(message_sv, 82);
          message = SvPVX(message_sv);
          ExpensiveGetString(row, col, message, 83);
          av_push(array, message_sv);

      Ordinarily, one would NUL-terminate with *SvEND(message_sv) = '\0', but I'm assuming that the NUL byte is the 83rd byte, so there's no need.
      --
      Marvin Humphrey
      Rectangular Research ― http://www.rectangular.com
        Thanks, your code works, and helped me figure it out.
        I had two problems.
        SvPOK_on must be called at some point; otherwise, even though the scalar contains data, perl doesn't 'know' that it does. Also, I wasn't calling SvCUR_set, which made perl think I had an empty string.
        I went through several variations on this theme, but all of them were missing that key piece... which makes perfect sense after reading perlguts a bit more carefully.
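The two problems described above can be illustrated with a small plain-C toy model. This is only a sketch under stated assumptions: `ToyScalar`, `toy_new`, `toy_value`, and `fake_get_string` are hypothetical names invented for illustration, and the struct is not Perl's real SV layout. It only mimics the idea that a scalar carries a buffer, an allocated size, a current length, and a "has a string" flag.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a Perl scalar. NOT Perl's real SV layout; the field
   names only echo the macros discussed in the thread. */
typedef struct {
    char  *pv;   /* buffer pointer, in the spirit of SvPVX */
    size_t len;  /* bytes allocated, in the spirit of SvLEN */
    size_t cur;  /* bytes in use, in the spirit of SvCUR */
    int    pok;  /* "holds a valid string" flag, in the spirit of SvPOK */
} ToyScalar;

/* Allocate room for `size` bytes plus a trailing NUL. The scalar starts
   out "undefined": it owns a buffer but claims no string. */
static ToyScalar toy_new(size_t size) {
    ToyScalar s;
    s.pv = calloc(size + 1, 1);
    if (s.pv == NULL) abort();
    s.len = size + 1;
    s.cur = 0;
    s.pok = 0;
    return s;
}

/* What a reader of the scalar sees: NULL models undef. When the flag is
   set, the visible string is the first `cur` bytes of the buffer. */
static const char *toy_value(const ToyScalar *s, size_t *out_len) {
    if (!s->pok) return NULL;
    *out_len = s->cur;
    return s->pv;
}

/* Hypothetical stand-in for the DLL routine: writes straight into the
   caller's buffer and returns the number of bytes written (no NUL). */
static size_t fake_get_string(int row, int col, char *buf, size_t bufsize) {
    int n = snprintf(buf, bufsize, "row %d, col %d", row, col);
    if (n < 0) return 0;
    return (size_t)n < bufsize ? (size_t)n : bufsize - 1;
}
```

In this model, filling `s.pv` through `fake_get_string` changes nothing a reader can see: `toy_value` still returns NULL until `pok` is set, and with `pok` set but `cur` left at 0 it reports an empty string. Only after both are updated does the data written into the buffer become visible, without any second copy.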

        Actually, SvPV_nolen() is safer than SvPVX(); SvPVX() gives you a pointer to the "PV" (the scalar's string value) even if the scalar doesn't have a string value while SvPV_nolen() will force the scalar to get a string value if it doesn't have one already.

        The problem is that the size of the buffer is specified but the length of the string stored in that buffer is never set. I'd change your XS-code-for-cargo-culting, (:, to:

            int bufsize= 83;
            SV* svBuf= newSV( bufsize );
            char* pBuf= SvPV_nolen( svBuf );
            ...( ..., pBuf, bufsize, ... );
            SvCUR_set( svBuf, length_of_data_written );

        where "length_of_data_written" might be variable, such as the return value from the function that stuffs characters into pBuf.

        - tye