Re: XS Prepending space to an SVs PV
by creamygoodness (Curate) on Apr 26, 2006 at 12:33 UTC
The key concept that salva exploits in his excellent example is the "offset OK hack". When you call substr in Perl and lop a few characters off the front of a scalar, Perl does not reallocate and copy. Instead, it performs the following steps:
- Store the number of bytes being lopped off in the scalar's IV.
- Move the SvPVX pointer forwards.
- Set the scalar's OOK flag.
- Turn off the scalar's IOK flag.
- Adjust the scalar's CUR and LEN to reflect the change.
$ perl -MDevel::Peek -e \
    'my $toes = "potatoes"; substr($toes, 0, 4, ""); Dump($toes);'
SV = PVIV(0x1801a20) at 0x1801434
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,OOK,pPOK)
  IV = 4 (OFFSET)
  PV = 0x300b64 ( "pota" . ) "toes"\0
  CUR = 4
  LEN = 5
See the section "Offsets" in perlguts for a thorough explanation, as well as the sv_chop function in perlapi. Perl_sv_chop in the sv.c source code is only a few lines long, so you might also want to snoop that.
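The steps listed above can be tried outside of Perl with a toy model. This is only a sketch in plain C: ToyString, toy_new, toy_chop and toy_free are made-up names, and the struct is a stand-in for the relevant SV fields, not Perl's real layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the string part of an SV (not Perl's layout). */
typedef struct {
    char  *pv;      /* start of the visible string, like SvPVX          */
    size_t cur;     /* visible length without the NUL, like SvCUR       */
    size_t len;     /* bytes allocated from pv onward, like SvLEN       */
    size_t offset;  /* bytes hidden in front of pv; nonzero plays "OOK" */
} ToyString;

/* Copy s into a freshly allocated buffer with no offset. */
static ToyString toy_new(const char *s) {
    ToyString t;
    t.cur = strlen(s);
    t.len = t.cur + 1;
    t.pv = malloc(t.len);
    assert(t.pv != NULL);
    memcpy(t.pv, s, t.len);
    t.offset = 0;
    return t;
}

/* Drop n leading bytes without copying anything. */
static void toy_chop(ToyString *t, size_t n) {
    assert(n <= t->cur);
    t->offset += n;  /* remember how many bytes were lopped off */
    t->pv     += n;  /* move the string pointer forwards        */
    t->cur    -= n;  /* adjust CUR ...                          */
    t->len    -= n;  /* ... and LEN to match                    */
}

/* free() must get the pointer malloc() returned, so back off first. */
static void toy_free(ToyString *t) {
    free(t->pv - t->offset);
    t->pv = NULL;
    t->cur = t->len = t->offset = 0;
}
```

Chopping 4 bytes off "potatoes" in this model leaves cur = 4, len = 5 and offset = 4, the same numbers as in the Dump output, with "pota" still sitting in memory just in front of pv.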
Re: XS Prepending space to an SVs PV
by salva (Canon) on Apr 26, 2006 at 11:39 UTC
static char *
my_sv_unchop(pTHX_ SV *sv, STRLEN size) {
    STRLEN len;
    char *pv = SvPV(sv, len);
    IV off = SvOOK(sv) ? SvIVX(sv) : 0;   /* bytes already chopped off the front */
    if (!size)
        return pv;
    if (off >= size) {
        /* enough chopped-off space in front: just move the PV pointer back */
        SvLEN_set(sv, SvLEN(sv) + size);
        SvCUR_set(sv, len + size);
        SvPV_set(sv, pv - size);
        if (off == size)
            SvFLAGS(sv) &= ~SVf_OOK;
        else
            SvIV_set(sv, off - size);
    }
    else if (len + size <= off + SvLEN(sv)) {
        /* the buffer is big enough: undo any offset and shift the data up */
        if (off) {
            SvLEN_set(sv, SvLEN(sv) + off);
            SvFLAGS(sv) &= ~SVf_OOK;
        }
        SvCUR_set(sv, len + size);
        SvPV_set(sv, pv - off);
        Move(pv, pv + size - off, len, char);
    }
    else {
        /* not enough room: build the string in a mortal and hand it to sv_setsv */
        SV *tmp = sv_2mortal(newSV(len + size));
        STRLEN tmp_len;
        char *tmp_pv;
        SvPOK_on(tmp);
        tmp_pv = SvPV(tmp, tmp_len);
        Move(pv, tmp_pv + size, len, char);
        SvCUR_set(tmp, size + len);
        sv_setsv(sv, tmp);
    }
    return SvPVX(sv);
}
This function efficiently inserts size bytes in front of the string and returns a pointer to the start of the new PV.
You would probably want to modify it so that more bytes than requested are reserved on the string, to optimize consecutive insertions.
Note that sv_setsv is used so that the string contents are copied only once: it steals the string buffer from the source SV when that SV is a mortal.
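The three branches of my_sv_unchop can be followed in a plain C model that needs no Perl headers. This is a sketch under assumptions, not the XS code itself: ToyString and the toy_* functions are made-up names, a plain malloc/free pair stands in for the mortal-and-sv_setsv step, and unlike the XS version it always keeps a trailing NUL. The return value only reports which branch ran.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the string part of an SV (not Perl's layout). */
typedef struct {
    char  *pv;      /* like SvPVX */
    size_t cur;     /* like SvCUR */
    size_t len;     /* like SvLEN, counted from pv */
    size_t offset;  /* chopped-off bytes in front of pv */
} ToyString;

/* Copy s into a new buffer with `extra` spare bytes at the end. */
static ToyString toy_new(const char *s, size_t extra) {
    ToyString t;
    t.cur = strlen(s);
    t.len = t.cur + 1 + extra;
    t.pv = malloc(t.len);
    assert(t.pv != NULL);
    memcpy(t.pv, s, t.cur + 1);
    t.offset = 0;
    return t;
}

/* Drop n leading bytes by moving the pointer (the offset trick). */
static void toy_chop(ToyString *t, size_t n) {
    assert(n <= t->cur);
    t->offset += n; t->pv += n; t->cur -= n; t->len -= n;
}

static void toy_free(ToyString *t) {
    free(t->pv - t->offset);
    t->pv = NULL;
}

/* Make room for `size` bytes in front; the caller fills them in.
 * Returns 1 (reclaimed offset), 2 (shifted in place), 3 (new buffer),
 * or 0 when size is 0. */
static int toy_unchop(ToyString *t, size_t size) {
    if (size == 0)
        return 0;
    if (t->offset >= size) {
        /* branch 1: walk the pointer back into the chopped-off bytes */
        t->pv -= size; t->cur += size; t->len += size; t->offset -= size;
        return 1;
    }
    if (t->cur + size + 1 <= t->offset + t->len) {
        /* branch 2: whole buffer is large enough, shift the data up */
        char *base = t->pv - t->offset;
        memmove(base + size, t->pv, t->cur + 1);
        t->len += t->offset;
        t->offset = 0;
        t->pv = base;
        t->cur += size;
        return 2;
    }
    {
        /* branch 3: allocate a new buffer and copy the data once */
        size_t new_len = t->cur + size + 1;
        char *fresh = malloc(new_len);
        assert(fresh != NULL);
        memcpy(fresh + size, t->pv, t->cur + 1);
        free(t->pv - t->offset);
        t->pv = fresh; t->cur += size; t->len = new_len; t->offset = 0;
        return 3;
    }
}
```

For example, after chopping 4 bytes off "potatoes", asking for 2 bytes takes branch 1, and asking for 3 more then falls through to branch 3 because neither the remaining offset nor the buffer is large enough.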
Modified to accept an additional argument that indicates how much extra space should be allocated when the string memory has to be reallocated:
static char *
my_sv_unchop(pTHX_ SV *sv, STRLEN size, STRLEN reserve) {
    STRLEN len;
    char *pv = SvPV(sv, len);
    IV off = SvOOK(sv) ? SvIVX(sv) : 0;   /* bytes already chopped off the front */
    if (!size)
        return pv;
    if (off >= size) {
        /* enough chopped-off space in front: just move the PV pointer back */
        SvLEN_set(sv, SvLEN(sv) + size);
        SvCUR_set(sv, len + size);
        SvPV_set(sv, pv - size);
        if (off == size)
            SvFLAGS(sv) &= ~SVf_OOK;
        else
            SvIV_set(sv, off - size);
    }
    else {
        /* from here on, size includes the extra reserve */
        size += reserve;
        if ((size < reserve) || (len + size < size))
            Perl_croak(aTHX_ "panic: memory wrap");
        if (len + size <= off + SvLEN(sv)) {
            /* the buffer is big enough: undo any offset and shift the data up */
            SvCUR_set(sv, len + size);
            SvPV_set(sv, pv - off);
            Move(pv, pv + size - off, len, char);
            if (off) {
                SvLEN_set(sv, SvLEN(sv) + off);
                SvFLAGS(sv) &= ~SVf_OOK;
            }
        }
        else {
            /* not enough room: build the string in a mortal and hand it to sv_setsv */
            SV *tmp = sv_2mortal(newSV(len + size));
            char *tmp_pv;
            SvPOK_on(tmp);
            tmp_pv = SvPV_nolen(tmp);
            Move(pv, tmp_pv + size, len, char);
            SvCUR_set(tmp, len + size);
            sv_setsv(sv, tmp);
        }
        /* chop the reserve off again so it stays in front for later calls */
        if (reserve)
            sv_chop(sv, SvPVX(sv) + reserve);
    }
    return SvPVX(sv);
}
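To get a feel for what the reserve argument buys, here is a simplified plain C model that counts how often a new buffer is needed. It is a sketch under assumptions: ToyString and the toy_* names are made up, and the in-place shift branch of the XS code is left out, so a prepend either reuses the front reserve or allocates.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the string part of an SV (not Perl's layout). */
typedef struct {
    char  *pv;      /* like SvPVX */
    size_t cur;     /* like SvCUR */
    size_t len;     /* like SvLEN, counted from pv */
    size_t offset;  /* reserved bytes in front of pv */
} ToyString;

static ToyString toy_new(const char *s) {
    ToyString t;
    t.cur = strlen(s);
    t.len = t.cur + 1;
    t.pv = malloc(t.len);
    assert(t.pv != NULL);
    memcpy(t.pv, s, t.len);
    t.offset = 0;
    return t;
}

static void toy_free(ToyString *t) {
    free(t->pv - t->offset);
    t->pv = NULL;
}

/* Make room for `size` bytes in front; the caller fills them in.
 * Returns 1 if a new buffer had to be allocated, 0 otherwise. */
static int toy_unchop_reserve(ToyString *t, size_t size, size_t reserve) {
    size_t front, total;
    char *fresh;

    if (size == 0)
        return 0;
    if (t->offset >= size) {
        /* cheap path: step back into the reserve left by an earlier call */
        t->pv -= size; t->cur += size; t->len += size; t->offset -= size;
        return 0;
    }
    front = size + reserve;
    assert(front >= size);          /* same idea as the "memory wrap" check */
    total = front + t->cur + 1;
    assert(total > front);
    fresh = malloc(total);
    assert(fresh != NULL);
    memcpy(fresh + front, t->pv, t->cur + 1);
    free(t->pv - t->offset);
    /* keep `reserve` bytes hidden in front, like the final sv_chop */
    t->offset = reserve;
    t->pv = fresh + reserve;
    t->cur += size;
    t->len = total - reserve;
    return 1;
}
```

In this model, 250 single-byte prepends with a reserve of 99 need 3 allocations; every other prepend only moves the pointer back by one.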
Many thanks (again) for this code; it is very generous of you, and so much easier to learn from than the pure reference material of the 'guts and 'api docs.
I set about adapting the above sub to my needs, which are slightly less demanding than yours. The naming of sv_chop (historical; set in stone; not of your making) is slightly confusing with respect to Perl's chop, as they operate on different ends of the string. That makes sv_unchop even more confusing :)
Anyway, whilst reading the docs on sv_chop, I noticed sv_insert. Relating that back to creamygoodness' explanation and demonstration that used substr to prepend to a string, I thought I'd see if I could use that to simplify things a little. What I came up with is this:
if( *a == 9 ) {                          /* Time to prepend another byte */
    if( SvOOK( n ) ) {                   /* If we've some reserve left use it */
        SvLEN_set( n, SvLEN( n ) +1 );
        SvCUR_set( n, ++l );
        SvPV_set( n, --a );
    }
    else {                               /* else insert 100 more bytes and use sv_chop */
                                         /* to reserve 99 of them for later */
        char pad[100] = { 0, };          /* Initialise the reserve to all zeros */
        sv_insert( n, 0, 0, pad, 100 );
        sv_chop( n, SvPVX( n ) + 99 );
        a = SvPVX( n );
        ++l;
    }
}
...
Whether that could be useful to you I don't know, but if you have the time to cast an eye over it and tell me if you see any obvious faux pas... thanks again.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
I had to do something similar on my Tie::Array::Packed module
I don't mind saying I wish you had sent me a mail about this; I would have been very happy to make Tie::Array::PackedC a pure-Perl backend to Tie::Array::Packed. (I guess some of the things I did in Tie::Array::PackedC didn't really translate over to XS.)
As it is, I'm overjoyed that you did Tie::Array::Packed. It's pretty much exactly what I would have written if I had taken the time to implement what I wanted in XS, except probably better. :-)
Anyway, I wonder how hard it would be to make a wrapper or something so that yours is used if it's available, falling back to mine if it's not...
---
$world=~s/war/peace/g
I don't mind saying I wish you had sent me a mail about this
Yes, I should have done that; my apologies for being so impolite!
I wonder how hard it would be to make a wrapper or something so that yours is used if it's available, falling back to mine if it's not...
Quite easy, I think, as your package implements a superset of the API provided by mine... I will try to do it.
Update: Tie::Array::Packed::Auto
Re: XS Prepending space to an SVs PV
by vkon (Curate) on Apr 26, 2006 at 11:13 UTC
Reading perldoc perlguts, isn't this what you need:
You can get and set the current length of the string stored in an SV
with the following macros:
    SvCUR(SV*)
    SvCUR_set(SV*, I32 val)
You can also get a pointer to the end of the string stored in the SV
with the macro:
    SvEND(SV*)
But note that these last three macros are valid only if "SvPOK()" is
true.
Also, search for SvCUR in perldoc perlguts.
And, if all else fails, one can always RTFS :):):)
malloc'd space [ | | | |s|o|m|e| |d|a|t|a| |h|e|r|e|\0]
                        ^
SV--->PVX---------------|
CUR-------------------->|.............................| = 15
LEN------------|......................................| = 19
which is the reverse of the normal situation, in which the extra space is at the end rather than the beginning; the rest of the codebase will respect these settings.
I've tried setting it up like this, but I get traps. I need to know whether I am setting it up wrong, or whether the rest of the codebase will always assume that CUR and LEN start at the same place (i.e. the place pointed at by PVX).
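One way to look at the layout in the diagram, as a plain C sketch rather than anything taken from the Perl source: whoever frees or grows the buffer has only the string pointer to start from, so the spare bytes in front have to be recorded somewhere, or the allocator will be handed a pointer it never returned. The offset field below plays that role. ToyString and the toy_* names are made up for illustration, and len is counted from pv, so the 19 allocated bytes appear as offset 4 plus len 15.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the string part of an SV (not Perl's layout). */
typedef struct {
    char  *pv;      /* like SvPVX */
    size_t cur;     /* like SvCUR, without the trailing NUL */
    size_t len;     /* like SvLEN, counted from pv */
    size_t offset;  /* spare bytes in front of pv */
} ToyString;

/* Allocate slack + strlen(s) + 1 bytes and place s after the slack. */
static ToyString toy_new_front_slack(const char *s, size_t slack) {
    ToyString t;
    char *base;
    t.cur = strlen(s);
    t.len = t.cur + 1;
    base = malloc(slack + t.len);
    assert(base != NULL);
    memcpy(base + slack, s, t.len);
    t.pv = base + slack;
    t.offset = slack;
    return t;
}

/* The pointer malloc() returned: the only one free() may be given. */
static char *toy_base(const ToyString *t) {
    return t->pv - t->offset;
}

/* Prepend one byte by stepping back into the slack.
 * Returns 0 when no slack is left, 1 on success. */
static int toy_prepend_byte(ToyString *t, char c) {
    if (t->offset == 0)
        return 0;
    t->pv--; t->offset--; t->cur++; t->len++;
    *t->pv = c;
    return 1;
}

static void toy_free(ToyString *t) {
    free(toy_base(t));
    t->pv = NULL;
}
```

With "some data here" and 4 bytes of slack, each prepend moves pv back by one while toy_base keeps returning the original allocation, so toy_free stays safe however much of the slack has been used.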
Although Perl will automatically grow strings for you, if you need to
force Perl to allocate more memory for your SV, you can use the macro
    SvGROW(SV*, STRLEN newlen)
Then, reading further in perlguts gives all your required manipulations:
Offsets
Perl provides the function "sv_chop" to efficiently remove characters
from the beginning of a string; you give it an SV and a pointer to
somewhere inside the PV, and it discards everything before the pointer.
The efficiency comes by means of a little hack: instead of actually
removing the characters, "sv_chop" sets the flag "OOK" (offset OK) to
signal to other functions that the offset hack is in effect, and it puts
the number of bytes chopped off into the IV field of the SV. It then
moves the PV pointer (called "SvPVX") forward that many bytes, and
adjusts "SvCUR" and "SvLEN".