The stupid question is the question not asked PerlMonks

### japhy looks at av.c (not av.h)

by japhy (Canon)
 on Nov 20, 2000 at 00:46 UTC ( #42437=note: print w/replies, xml ) Need Help??

in reply to Sort this data

### UPDATE

As per my revelations in sort this data, here is a bit of an adjusted report on the av.c source.
Here's the relevant portion from av.c in the 5.6 source:
```/* this is Perl_av_unshift()
it unshifts 'num' undef values to an array */

/* determine how much non-used spaced is left
that's been allocated for this array */
i = AvARRAY(av) - AvALLOC(av);

/* if there's room left... */
if (i) {
/* if there's more room than we need, just use 'num' */
if (i > num) i = num;

/* this will set 'num' to 0 if we had enough room */
/* 'num' is now how many new undef values we need added */
num -= i;

AvMAX(av) += i;  /* set the highest subscript??? */
AvFILLp(av) += i;  /* add to highest subscript */
SvPVX(av) = (char*)(AvARRAY(av) - i);  /* where Perl's array starts
+*/
}

/* if there wasn't enough room already... */
if (num) {
i = AvFILLp(av);  /* highest subscript */
av_extend(av, i + num);  /* extend array to i+num elements */
AvFILLp(av) += num;  /* add to highest subscript */
ary = AvARRAY(av); /* get at the array */
Move(ary, ary + num, i + 1, SV*);  /* slide elements up */
do {
ary[--num] = &PL_sv_undef;  /* set new elements to undef */
} while (num);
}
Ok, so unshift() has to do a lot of sliding the elements up if there's not already some free space generated by a call to shift().

What shift() does is simply slide the pointer up one notch, which is very fast (which provides free space for unshift() later). It stores the value in *AvARRAY(av) into retval, and then sets *AvARRAY(av) to the C equivalent of Perl's undef value. Then it moves the pointer that marks the beginning of the array to AvARRAY(av) + 1. It also subtracts one from the max subscript and the length of the array. Then it returns retval.

<revelation>I'm beginning to grok the source code! Look out!</revelation>.

japhy -- Perl and Regex Hacker

Replies are listed 'Best First'.
Re (tilly) 1: japhy looks at av.c (not av.h)
by tilly (Archbishop) on Nov 20, 2000 at 02:12 UTC
Nice!

For those who are not following the code, the logic here is based on first trying to allocate elements which there is room for (this is the "if (i)" bit). If it cannot get them all in it then makes sure it has enough space, does some accounting, copies everything over, then inserts some new stuff. Note that japhy's comment about unused space is misleading, he means unused at the beginning of the array. There is also unused stuff at the other end, but we cannot directly use that.

For full details you have to also look at av.h. The call to AvMAX is just a macro to set xav_max which is the largest element which there is space for (adding to the front increases that), and AvFILL is setting xav_fill which is how many you have (adding obviously increases that as well).

The call to av_extend is where the array size increases. What it does is extends the definition of what the array is occupying. In fact it is in a buffer whose size is allocated in powers of two. If the extend takes the array beyond the size of the buffer, then a new buffer is allocated and the array is moved. If it does not then the array is left where it is.

Now it is clear how to make repeated calls to unshift fairly efficient. Right now it moves stuff by the minimum necessary. What we need is to have a new variable for how much to move it. That variable should be the maximum of num and the length of the array. This will cause space wastage, but it will also mean that when you hit an unshift and it has to move the array, it will not hit that copying logic again right away.

Even so building up an array using n calls to push will still be faster than unshift because there is less accounting to do. But both will be order n rather than having n calls to unshift being order n^2 as it is today.

Re: japhy looks at av.c (not av.h)
by extremely (Priest) on Nov 20, 2000 at 02:13 UTC
You know, from looking at the code elsewhere (look for av_extend) they have special code in there to stretch out the array for preextending. Maybe we need a slight syntax change. Maybe \$#=-500; should preallocate space at the beginning of the array without changing the size of it. It is shame the 6.0 process is frozen or I would hack up a quick RFC.

--
\$you = new YOU;
honk() if \$you->love(perl)

Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://42437]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (5)
As of 2023-06-06 11:45 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
How often do you go to conferences?

Results (26 votes). Check out past polls.

Notices?