I'm writing Inline::C wrappers over some code that reads and writes LZW-compressed data (/usr/bin/compress and /usr/bin/uncompress). That code is already very speedy and copies the data one character at a time into a buffer. I'm having perl manage the buffer as a plain string, and I grow it whenever the write pointer reaches the end.

I'm not experienced with C. Tell me if I'm doing something wrong.

Is there a good value for SZSIZE? I just picked 8192 out of the air because it seemed reasonable. I'd like to avoid fragmenting memory too much. I'd also like to prevent unnecessary copying. Are there any good strategies for serving these goals? Is there some magic value associated with modern (or not so modern) perl that I'd want to follow?

#define SZSIZE 8192

SV* LZW_zread( ... ) {
    SV* retval;
    u_char *bp, *end_bp;
    u_int offset;
    ...
    retval = NEWSV( ..., SZSIZE );
    SvPOK_only( retval );
    bp = (u_char*)SvPVX( retval );
    end_bp = SZSIZE + bp;
    ...
    while ( ... ) {
        if ( bp == end_bp ) {
            /* When bp == end_bp the buffer must be grown. bp is
               currently just off the end of the buffer, and any
               operation on it is unsafe. sv_grow may move the
               buffer, so bp is recomputed from the saved offset. */
            offset = bp - (u_char*)SvPVX(retval);
            sv_grow(retval, offset + SZSIZE);
            bp = (u_char*)SvPVX(retval) + offset;
            end_bp = SZSIZE + bp;
        }
        ...
        *bp++ = ...;
    }
    ...
    SvCUR_set(retval, bp - (u_char*)SvPVX(retval));
    return retval;
}
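For comparison, here is the same grow-when-full loop in plain C, with malloc/realloc standing in for NEWSV/sv_grow so it can be compiled without the perl headers. It is only a sketch: the growbuf type and its functions are hypothetical names, and doubling the capacity (instead of adding a fixed SZSIZE each time) is one common strategy I'm assuming for illustration, not something I know to be the recommended perl approach.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer; stands in for the SV's string slot. */
typedef struct {
    unsigned char *data;  /* start of the allocation              */
    size_t len;           /* bytes written so far                 */
    size_t cap;           /* bytes currently allocated            */
    size_t grows;         /* number of reallocations, for display */
} growbuf;

/* Allocate the initial buffer. Returns 0 on success, -1 on failure. */
static int growbuf_init(growbuf *b, size_t initial) {
    assert(b != NULL);
    if (initial == 0)
        initial = 1;      /* avoid malloc(0), whose result is implementation-defined */
    b->data = malloc(initial);
    b->len = 0;
    b->grows = 0;
    b->cap = b->data ? initial : 0;
    return b->data ? 0 : -1;
}

/* Append one byte, doubling the capacity when the buffer is full.
   realloc may move the block, so the write position is kept as an
   index (len) rather than a raw pointer. Returns 0 or -1. */
static int growbuf_putc(growbuf *b, unsigned char c) {
    if (b->len == b->cap) {
        size_t new_cap = b->cap ? b->cap * 2 : 1;
        unsigned char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;    /* old buffer is still valid on failure */
        b->data = p;
        b->cap = new_cap;
        b->grows++;
    }
    b->data[b->len++] = c;
    return 0;
}

/* Release the buffer and reset the bookkeeping. */
static void growbuf_free(growbuf *b) {
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

Starting from 8192 bytes, writing 100000 bytes this way takes 4 reallocations (ending at 131072 bytes allocated), where growing by a fixed 8192 each time would take 12. The cost is that up to roughly half of the final allocation can go unused.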

In reply to Growing strings, avoiding copying and fragmentation? by diotalevi
