Thank you very much to both dk and bart. I guess this leaves me free to choose my own matrix orientation. When I'm done with this new subclass I plan to share it with the monks for feedback. Thanks again!
The important question is: which dimension will you be iterating over most frequently?
PDL applies operations to whole piddles at a time. If you arrange your matrix so that its longest dimension is the first (leftmost) one, the loop code at the assembler level can likely use auto-incrementing addressing modes, because the elements along the fastest-changing dimension sit at consecutive addresses. That is faster, often much faster, than adding a large offset to calculate the next address each time. It will also likely benefit from better cache locality, which again can have a significant impact on performance.
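To make the addressing argument concrete, here is a rough sketch in plain Python (deliberately not PDL code). It assumes the first index varies fastest in memory, as described above; the helper names flat_offset and strides_along are made up for illustration and are not part of any library.

```python
# Hypothetical helper (not a PDL function): flat offset of element (i, j)
# in an n0 x n1 array whose FIRST index varies fastest in memory.
def flat_offset(i, j, n0):
    return i + j * n0


def strides_along(dim, n0, n1):
    """Return the set of gaps between consecutive flat offsets
    when stepping along one dimension (0 = first, 1 = second)."""
    if dim == 0:
        offsets = [flat_offset(i, 0, n0) for i in range(n0)]
    else:
        offsets = [flat_offset(0, j, n0) for j in range(n1)]
    return {b - a for a, b in zip(offsets, offsets[1:])}


# Illustrative shape: long first dimension, short second dimension.
N0, N1 = 1000, 3

first_dim_steps = strides_along(0, N0, N1)   # consecutive addresses
second_dim_steps = strides_along(1, N0, N1)  # jumps of N0 elements

print(first_dim_steps, second_dim_steps)  # prints: {1} {1000}
```

Stepping along the first dimension moves one element at a time, while stepping along the second jumps a whole row's worth of elements on every step. Putting the long dimension first means the bulk of the iteration happens in the step-by-one case.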
So, all else being equal for your algorithm, make the first (leftmost) dimension of your piddle the largest. See also the PDL documentation on implicit and explicit 'threading'. It goes right over my head, but it seems applicable to this discussion.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
I'm not sure whether I will even be using any threading in the work I'll be doing. I'm mainly using PDL so I can use matrix multiplication to make my equations simpler. However, I will keep your suggestion in mind, since I may end up using threading to increase speed where possible.