One thought. Does your current profiling show that you are spending an excessive amount of time in each call to the XS sub? Or is the time accumulated over many, many relatively quick calls?

The point being that to make really effective use of XS routines for speeding things up, each call to the XS routine should do as much work as possible. This is because there is a significant overhead involved in the transfers into and out of an XS subroutine.

For example, PDL does its thing very efficiently, and if you have a million numbers to be processed and pass them to PDL as a single lump, the speed-up over a pure Perl equivalent is enormous. But if your requirements mean that you have to process those numbers as 100,000 sets of 10, then your gains will be far less dramatic--maybe even negative. This is because the time saved by processing just 10 numbers in C is outweighed by the overhead of the transfer into XS and back.

A second thought. When it comes to profiling C code, calls to system routines (clock etc.) can also have a significant overhead. And if you combine each call to get the time with another to printf the results somewhere, the overhead can completely distort the timing of the code under test.

If you are using an Intel processor, there is a single instruction, rdtsc, which reads a high-resolution timestamp counter with very little overhead. If your compiler allows in-line assembler--or has a _rdtsc() intrinsic--then this can be combined with a small piece of C--as a macro--that fetches the counter and writes it to a buffer along with a line number (from __LINE__) very efficiently. That allows you to tag various points in your C code and gather some stats without overly distorting the timing.

I don't have an example to hand. The last time I did this was on my old machine, so the source is archived off on a CD somewhere. But it shouldn't be too hard to recreate it if it would be useful?
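
In the meantime, here is a rough sketch of the kind of macro I mean. It is a reconstruction, not the archived code, and every name in it (TSC_TAG, TSC_MAX, tsc_buf, tsc_dump, read_ticks, demo_sum) is made up for illustration. It assumes GCC, Clang or MSVC and uses the __rdtsc() intrinsic those compilers provide; on non-x86 targets it falls back to timespec_get(), which is my own addition and costs far more than rdtsc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#  if defined(_MSC_VER)
#    include <intrin.h>     /* __rdtsc() on MSVC */
#  else
#    include <x86intrin.h>  /* __rdtsc() on GCC and Clang */
#  endif
static inline uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Assumption: non-x86 fallback so the sketch still builds; not rdtsc. */
#  include <time.h>
static inline uint64_t read_ticks(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

#define TSC_MAX 4096  /* hypothetical buffer capacity; extra tags are dropped */

typedef struct { uint64_t tsc; int line; } tsc_rec;

static tsc_rec tsc_buf[TSC_MAX];
static size_t  tsc_n = 0;

/* Record the counter and the current source line. No I/O happens here. */
#define TSC_TAG() do {                          \
    if (tsc_n < TSC_MAX) {                      \
        tsc_buf[tsc_n].tsc  = read_ticks();     \
        tsc_buf[tsc_n].line = __LINE__;         \
        ++tsc_n;                                \
    }                                           \
} while (0)

/* Print the deltas between consecutive tags, once the timed code is done. */
static void tsc_dump(FILE *out) {
    assert(out != NULL);
    for (size_t i = 1; i < tsc_n; ++i)
        fprintf(out, "line %d -> line %d: %llu ticks\n",
                tsc_buf[i - 1].line, tsc_buf[i].line,
                (unsigned long long)(tsc_buf[i].tsc - tsc_buf[i - 1].tsc));
}

/* Example of tagging a region: the loop sits between two tags. */
static double demo_sum(const double *v, size_t n) {
    double total = 0.0;
    TSC_TAG();
    for (size_t i = 0; i < n; ++i)
        total += v[i];
    TSC_TAG();
    return total;
}
```

The printing is kept out of the macro on purpose: the tags only store two values in a static buffer, and tsc_dump(stderr) is called once at the end, after the code being timed has finished. Bear in mind that the counter ticks at a rate that need not match the core clock, and on older multi-core machines it may differ between cores, so treat the deltas as relative measures.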


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP PCW It is as I've been saying!(Audio until 20090817)

In reply to Re: profiling XS routines by BrowserUk
in thread profiling XS routines by jpl
