in reply to Re^2: How to speed up a nested loop?
in thread How to speed up a nested loop?

I'm not 100% sure what you mean by this, but it denotes some kind of bulk operation, right? I'm not too sure that that would be very useful, given that i don't only need to transfer the data, but also need to check whether the input data is different than the already existing data.

I assume that you are doing some kind of billboarding for the backdrops of your scenes. And you are attempting to optimise that by onlu updating the cells or slices if they have changed. But, often as not with these things, is frequently quicker to avoid the conditionals and simple update everything. When the screen is redrawn, because the player moved or rotated his viewpoint, everything needs to be scaled/slewed and redrawn anyway.

So, if you could do @{ $tiles[z][y] } = @tile_data[ $y*16 .. ($y+1) * 16 - 1 ];-- that is, assign a complete X-slice in one go--then you've both avoided 16*2 index calculations, but also 16 condition tests.

If you could arrange for the input data to be pre-formatted as a 2d array, then you can also avoid the slice operation and just assign pointers to arrays of 16 values, with a significant further saving.

I appreciate it might require a re-think on some of your logic to make use of this, and that may be further than you wish to go, but it can be effective.

It's similar to some descriptions of the Fischer-Yates shuffle, that test each pairing and avoid the swap if the values are the same. In most cases, it is cheaper to go ahead and swap them unconditionally, than to test the condition for every pairing.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Replies are listed 'Best First'.
Re^4: How to speed up a nested loop?
by Xenofur (Monk) on Sep 18, 2008 at 14:49 UTC
    Ah, i see where you're coming from. I need to explain however that that the analysis for change is necessary. You see, in order to actually have things be decently fast in OpenGL, i compile the contents of one 16x16 block into one display list, that can be called as one single line in perl and gets drawn directly from gfx card ram. Without those this would be pretty much impossible.

    However the act of actually constructing those is pretty costly timewise. To give a comparison, the following is from one refresh cycle where ALL display lists (98 in this instance) get generated:
    the code took: 0.467388 wallclock secs ( 0.47 usr +  0.06 sys =  0.53 CPU)
    And this one is from when NO display lists are generated:
    the code took: 0.025486 wallclock secs ( 0.03 usr +  0.00 sys =  0.03 CPU)
    As such, making the assignments would shave time off of an "idle refresh", but greatly increase the general costs by increasing the number of "rebuild refreshes".

    The increases i'm looking for right now are necessary to ensure the ability to refresh at a constant rate while no game changes are occuring, without having any skipping in the operation. This goal is pretty much reached, as i can now fix the FPS at 30 and updates should fit inbetween frame generation.