in reply to [OT] Swapping buffers in place.

Well, my intuition tells me that it can be done with just a single temporary register, but the pointer arithmetic you'd have to do is pretty arcane. (I think having just two temp registers instead of one won't make it any easier.)

I think the idea expressed in the first reply, involving enough temp storage to hold the difference in size between the two parts, will have the smallest footprint that can be implemented by simply incrementing (or decrementing) pointers.

But if you really want to pursue a single-temp-register solution, the basic idea for the pointer arithmetic would go like this:

  1. Pick a starting point in the buffer and move its value into the temp register - for simplicity, let this initial "move-to" pointer (p1) be offset 0 from the start of the buffer.
  2. What do you add to this address in order to point to the element that should be moved here? Let's call it D; set a "move-from" pointer (p2) equal to p1 + D.
  3. Move the value from p2 to p1.
  4. Now set p1 = p2 ("move-from" becomes "move-to") and add D to p2 (setting the next "move-from" value) and go back to step 3 - but here's the first arcane part: whenever this addition puts the pointer value past the end of the buffer, subtract the length of the buffer (i.e. allow the pointer to just wrap around).
  5. The next arcane part: if D is an even number, then at some point you'll reach an iteration where p2 is set to the initial value of p1 (where the initial move was from p1 into the temp register), so move from the temp into this p1 location, then increment both pointers by 1, move the value from the new p1 into the temp register, and return to step 3.
  6. Last arcane part: keep count of every move, including moves out of (not into) the temp register; when you've done buffer_count moves, you're done. (When D is odd, this will be the first iteration where p2 turns up to have the initial value of p1, so you'll move from temp on this iteration.)
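For what it's worth, here is a minimal Python sketch of the steps above, assuming the goal is to turn [x...][y...] into [y...][x...] with D equal to the length of the x part. The function name, the use of list indexes in place of raw pointers, and the modulo wrap-around are my own choices for illustration. It closes a cycle whenever p2 comes back to the slot whose value is sitting in temp, rather than testing whether D is even.

```python
def rotate_left_in_place(buf, d):
    """Rotate buf left by d slots using a single temporary value.

    Afterwards buf[i] holds what used to be at buf[(i + d) % n],
    so the first d elements end up at the tail of the buffer.
    """
    n = len(buf)
    if n == 0:
        return
    d %= n
    if d == 0:
        return
    moves = 0                  # step 6: slot-to-slot moves plus moves out of temp
    p1 = 0                     # step 1: initial "move-to" index
    while moves < n:
        start = p1             # remember which slot was saved into temp
        temp = buf[p1]         # step 1: move the starting value into temp
        p2 = (p1 + d) % n      # step 2: "move-from" index, wrapping around
        while p2 != start:
            buf[p1] = buf[p2]  # step 3
            moves += 1
            p1 = p2            # step 4: "move-from" becomes "move-to"
            p2 = (p2 + d) % n
        buf[p1] = temp         # step 5: p2 is back at the start, so use temp
        moves += 1
        p1 = (p1 + 1) % n      # begin the next cycle one slot further on


buf = list("ABCDEFGHIJ")
rotate_left_in_place(buf, 4)
print("".join(buf))  # EFGHIJABCD
```

The loop stops as soon as buffer_count values have been placed, which always coincides with a move out of temp.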

I hope I've got that right (I suspect there may be a better way to describe it)... If it all sounds like a better idea than using more storage, then have fun with that.

(Update: Regarding the "completion condition" (item 6), there is a better (more correct) description: it will always happen on an iteration where a value is moved out of the temp register. When D is odd, this happens only once, and when it's even, it happens exactly twice - once for the even-numbered offsets, and once for the odd-numbered ones.)

(Another update -- sorry... The problem in steps 5 and 6 is not just an even/odd thing. It depends on the greatest common divisor of D and buffer_count: that divisor is the number of times you'll need to move a value out of the temp register. E.g. if D is 4 and buffer_count is 12, the divisor is 4, so you'll need to move values out of the temp register a total of four times; you'll need to add 1 to the pointers after the first three of those, and the fourth one will be the final iteration.)
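A quick way to convince yourself of the divisor claim is to walk the index cycles without touching any data. This is only a checking sketch: the function name is made up, and the "seen" list is extra storage used for counting, not part of the in-place method.

```python
from math import gcd


def count_temp_writes(n, d):
    """Count the separate cycles of i -> (i + d) % n over n slots.

    Each cycle needs exactly one move out of the temp register.
    """
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if seen[start]:
            continue
        cycles += 1
        i = start
        while not seen[i]:
            seen[i] = True
            i = (i + d) % n
    return cycles


# Compare the counted cycles with the greatest common divisor.
for n, d in [(10, 4), (12, 4), (16, 8), (9, 4)]:
    print(n, d, count_temp_writes(n, d), gcd(n, d))
```

Note the (16, 8) case: both numbers are multiples of 4, but their common divisor is 8, so eight moves out of temp are needed there.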

Can't... stop... updating... I improved the phrasing in step 3 (I hope it's clearer), and here is a diagram of the sequence (using a small buffer).

The sequence of steps is laid out from top to bottom; each step involving a movement of data is numbered (including moves into the temp register "TR", so there are 12 steps, because two such moves are needed in this example). The buffer slots are laid out horizontally, with blow-by-blow comments down the right-hand side.

So, in addition to the temp register, you need to store the memory address from which a value was last moved into TR; when the "move-from" pointer (P2) is set to that address, you have to move the value from TR, instead of from P2, then increment P1 and P2, move from P1 to TR, and then resume.

     TR   0  1  2  3  4  5  6  7  8  9
         [x  x  x  x][y  y  y  y  y  y]
 0.  _/   A  B  C  D  E  F  G  H  I  J
          *P1         *P2                  set pointers, bgn=0
 1.  A/   A  B  C  D  E  F  G  H  I  J     *P1 -> TR
 2.  A/   E  B  C  D  E  F  G  H  I  J     *P2 -> *P1
                      *P1         *P2      shift pointers
 3.  A/   E  B  C  D  I  F  G  H  I  J     *P2 -> *P1
                *P2               *P1      shift pointers
 4.  A/   E  B  C  D  I  F  G  H  C  J     *P2 -> *P1
                *P1         *P2            shift pointers
 5.  A/   E  B  G  D  I  F  G  H  C  J     *P2 -> *P1
          *P2               *P1            shift pointers
                                           (now P2 points to "bgn" location (0), so fetch from TR:)
 6.  A/   E  B  G  D  I  F  A  H  C  J     TR -> *P1
             *P2               *P1         increment pointers, bgn=7
 7.  H/   E  B  G  D  I  F  A  H  C  J     *P1 -> TR
 8.  H/   E  B  G  D  I  F  A  B  C  J     *P2 -> *P1
             *P1         *P2               shift pointers
 9.  H/   E  F  G  D  I  F  A  B  C  J     *P2 -> *P1
                         *P1         *P2   shift pointers
10.  H/   E  F  G  D  I  J  A  B  C  J     *P2 -> *P1
                   *P2               *P1   shift pointers
11.  H/   E  F  G  D  I  J  A  B  C  D     *P2 -> *P1
                   *P1         *P2
                                           (now P2 points to "bgn" location (7), so fetch from TR:)
12.  H/   E  F  G  H  I  J  A  B  C  D     TR -> *P1 (last iteration)
         [y  y  y  y  y  y][x  x  x  x]
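If you'd like to replay the diagram, here is a small Python tracing sketch. It assumes the buffer holds single-character strings and numbers every data move, including moves into TR, the same way the diagram does. The generator name and the "bgn" variable are just labels borrowed for illustration.

```python
def trace_rotation(buf, d):
    """Yield (step, temp, snapshot) after every data move.

    Moves into the temp register are numbered too, as in the diagram.
    """
    n = len(buf)
    step = 0
    placed = 0                 # slots that have received their final value
    p1 = 0
    while placed < n:
        bgn = p1               # address last saved into the temp register
        temp = buf[p1]         # *P1 -> TR
        step += 1
        yield step, temp, "".join(buf)
        p2 = (p1 + d) % n
        while p2 != bgn:
            buf[p1] = buf[p2]  # *P2 -> *P1
            placed += 1
            step += 1
            yield step, temp, "".join(buf)
            p1, p2 = p2, (p2 + d) % n   # shift pointers, wrapping around
        buf[p1] = temp         # P2 reached bgn: TR -> *P1
        placed += 1
        step += 1
        yield step, temp, "".join(buf)
        p1 = (p1 + 1) % n      # increment pointers for the next cycle


for step, temp, snapshot in trace_rotation(list("ABCDEFGHIJ"), 4):
    print(f"{step:2}. {temp}/ {snapshot}")
```

For the ten-slot example with D = 4 this produces twelve numbered rows, with the second cycle starting at slot 7.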