Wow. (Update: I guess you are thinking of C arrays not Perl arrays, or at least an implementation much more like a naive, dynamic C array than what Perl actually uses.)
I'll agree with pop and shift being O(1) and not just in an average case. They are always O(1). (I won't argue with whether O() notation implies worst case or average case or whatever, mostly because I don't care.)
I'll agree with splice being O(N). Splicing to remove from the middle is always O(N/2). Splicing at position J elements from either end if often O(J). But splicing to insert K elements can be O(N+K). So averaging out to O(N) isn't a bad summary. Continuing to ignore some predicted objections to misuse of O(), I'd say splicing to remove is O(N/2) while splicing to insert is O(N+K). :)
If you push N times, then that is O(2*N), O(N) for the realloc() calls that likely end up copying the elements each time the array size doubles and O(N) for the N items being pushed. So push is O(2) on average. And using push on an array that both shrinks and grows is closer to O(1) than O(2) (yes, I'm still ignoring it).
If you unshift N times, then that is a bit more complex. I don't have swapped in the exact algorithm that is used for "leave some space at the front in case they unshift again", but I think it is also exponential (or is that "geometric"?) so unshift probably averages to O(3).
See Shift, Pop, Unshift and Push with Impunity! for another take on this.
| [reply] |
Double check. Last I checked, unshift was O(1) average case as well. | [reply] |
I'm sure you mean that push is O(1). shift and unshift can be made O(1) very easily by the space padding provided at the beginning of an array and they have 1 O(N) operation every M entries (where M, I think, is in the range of 50).
Splice is O(N) only if you require that the Perl array be optimized for random access. If you're willing to allow for disjoint arrays and do some calculations that can be a bit fiddly, then you can make splice O(1) and random access O(1) (but potentially slower than it is right now).
My criteria for good software:
- Does it work?
- Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
| [reply] |
Splice is O(N) only if you require that the Perl array be optimized for random access.
Indeed, you could even implement Perl arrays as linked lists and then...
But that is, of course, quite beside the point of what a "Perl array" is (not what it might one day be in some unlikely future of your imagination). What a "Perl array" is not, is a linked list. And splice is not O(1) for a Perl array, no matter how strenuously you object to reality. :)
| [reply] |
I'll concede that splice, as currently implemented, is O(N) (much as I hate to say that).
That said, splice, shift, and unshift could all become O(1) if the following were to happen:
- Whenever a shift happens, that space is kept at the front of the array and the internal starting point gets moved. (This already happens, I believe).
- Whenever a mutation is made that would force a reshuffle, instead a pointer is made to another area (kind of like an arrayref). This would mean that bookkeeping would have to be instituted so that the memory location can be found in the case of a random access. I haven't written this bookkeeping code, but my gut says that it's not too squirrelly. This would work for both unshift and splice.
- In the case of a normal interaction with the array, no shuffling would occur, so this would just entail a single extra check into the bookkeeping data structure to determine that there is no special bookkeeping.
I suppose I could write up a Tie::Array that would do this. In thinking about it, this may sound like a proper solution for DBM::Deep's arrays. What do you think?
My criteria for good software:
- Does it work?
- Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
| [reply] |