in reply to Re^8: If I am tied to a db and I join a thread, program chrashes
in thread If I am tied to a db and I join a thread, program chrashes

Ah, I didn't see that the count isn't reduced by 64. So that's another bug. Checking the time only every 64th round is legitimate, though, as benchmarks should exclude any overhead from the measurements themselves. As the ithreads version suffers no disadvantage from that and the Coro measurements are more exact, this is not loading the dice.
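
To make the "check the clock only every 64th round" idea concrete, here is a minimal sketch. It is not the code from the benchmark script; `run_for`, its parameters and the trivial work sub are made-up names for illustration. The loop counts every round it executes but only calls `time` on every 64th, so the measurement overhead stays small and the final count matches the rounds that actually ran.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper (not from the benchmark under discussion):
# run $work repeatedly for roughly $duration seconds, looking at the
# clock only once every $check_every rounds.
sub run_for {
    my ($work, $duration, $check_every) = @_;
    $check_every ||= 64;

    my $start = time;
    my $count = 0;
    while (1) {
        $work->();
        $count++;
        next if $count % $check_every;        # skip the clock on most rounds
        last if time - $start >= $duration;   # checked every $check_every rounds
    }

    # $count holds every round that ran, so no correction is needed later.
    my $elapsed = time - $start;
    return ($count, $elapsed);
}

my ($rounds, $seconds) = run_for(sub { my $product = 3 * 7 }, 0.2);
printf "%d rounds in %.3fs (%.0f rounds/s)\n",
    $rounds, $seconds, $rounds / $seconds;
```

Because the loop can only stop on a clock check, the returned count is always a multiple of the check interval, and the run may overshoot the requested duration by up to one interval's worth of work.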

I don't know what the copying is about. It generally makes sense to unshare data...

Sorry, my English has bugs as well. I think you misunderstood what I was saying here. I hope the sentence is more understandable now with the added word "generally". That his use of it in this benchmark is massively loading the dice is without question.

As I was saying in the last post (maybe not clearly enough), this script can only work as a Coro benchmark. I'm not arguing that the ithreads side of that code has any merit (I didn't even look at it when I was inspecting the code). But apart from the bug with the time measurement, the Coro side of the benchmark seems to be a valid benchmark. And on that side the design decisions of the writer make sense (to me at least). And I suspect that Marc Lehmann first had the (sensible) Coro version and then added an ithreads version without taking into account that a direct translation to ithreads makes no sense. Whether he did that on purpose, who knows? It is at least incredibly sloppy or stupid if it wasn't on purpose. That he put the benchmark on the net might indicate the former.

...means constantly reallocating and copying the underlying shared array...

I thought that by "cleaning up the queues" you meant processing the rest of the queue after the last time measurement was done. Now your point makes more sense.

Re^10: If I am tied to a db and I join a thread, program chrashes
by BrowserUk (Patriarch) on Jun 11, 2009 at 14:29 UTC
    But apart from the bug with the time measurement the Coro side of the benchmark seems to be a valid benchmark.

    I don't wish to press the point, though I suppose I am by even mentioning it, but I'm not sure it makes much sense even as a standalone benchmark of Coro. I'll explain why, but don't feel the need to respond.

    What exactly is it benchmarking?

    • Given the output number: "multiplications ... per second", you might say multiplication...

      but of course it's Perl that's doing the multiplication, and if you take Coro out of the picture, Perl alone wins hands down.

    • And if the test is how quickly Coro can switch between its 'threads', with the multiplications as just the metric indicating how much (or little) time penalty Coro thread-switching imposes...

      Why not just set four threads running doing multiplies in a loop, and ceding every N?

    • And if the purpose is to test the efficiency of Coro queues...

      Why bother with all the multiplying?

    I really cannot see the merit of the benchmark, either as a comparative study of Coro and iThreads or as a standalone test of Coro itself.

    One thing is for sure, if this is the basis of the POD claim: "A parallel matrix multiplication benchmark runs over 300 times faster on a single core than perl's pseudo-threads on a quad core using all four cores.", then quite frankly, he should be prosecuted by the Statistics Police :)

    And the sentence: Unlike the so-called "Perl threads" (which are not actually real threads but only the windows process emulation ported to unix, and as such act as processes), is a candidate for the I-know-what-you-were-trying-to-say-but-that-isn't-it of The Year award. :)

    I'll continue to endeavour to get Coro to build on my system, and if I succeed, I'll attempt to produce a fair comparison of matrix multiplication using both. Within the limitations of Coro, I believe that it would still show Coro in a good light. threads::shared memory is horribly and unnecessarily slow. I wish I could see how to address that. But the claim above is frankly ludicrous.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      I tried to resist but eventually I lost my resolve and responded ;-)

      What exactly is it benchmarking?

      My answer would be matrix multiplication. OK, his test matrices are too small (which makes it a worst-case benchmark). But in the proceedings of the Perl workshop there is a diagram where matrix multiplications/s (not simply multiplications/s) are compared to the matrix size. The diagram shows that he tested variable matrix sizes, up to 1000x1000 matrices, and also used a different benchmark metric. PS: I found the diagram on the same server where the test script is, http://data.plan9.de/mat.png

      Naturally the Coro version is slower than pure Perl. But the interesting thing is how much slower. Threads allow different programming styles or paradigms, for example producer/consumer relationships. How big is the penalty for doing it this way instead of the simple iterative way?
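
      As a rough illustration of that question, the sketch below does the same multiplications twice: once in a plain loop and once in producer/consumer style with a worker thread and two queues. It is an assumption-laden toy, not the benchmark script: it uses the core threads and Thread::Queue modules rather than Coro, and the sub names, the undef end marker and the workload size are all made up for the example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Time::HiRes qw(time);

# Simple iterative style: multiply each pair in a plain loop.
sub multiply_iterative {
    my @pairs = @_;
    return [ map { $_->[0] * $_->[1] } @pairs ];
}

# Producer/consumer style: the caller produces work items, one worker
# thread consumes them and reports results on a second queue.
sub multiply_queued {
    my @pairs = @_;
    my $todo = Thread::Queue->new;
    my $done = Thread::Queue->new;

    my $worker = threads->create(sub {
        # An undef item is this sketch's end-of-work marker.
        while (defined(my $item = $todo->dequeue)) {
            my ($index, $x, $y) = @$item;
            $done->enqueue([ $index, $x * $y ]);
        }
    });

    $todo->enqueue([ $_, @{ $pairs[$_] } ]) for 0 .. $#pairs;
    $todo->enqueue(undef);

    # Results may be collected in any order; the index restores it.
    my @products;
    for (1 .. @pairs) {
        my ($index, $product) = @{ $done->dequeue };
        $products[$index] = $product;
    }
    $worker->join;
    return \@products;
}

# Time both styles on the same (arbitrary) workload.
my @pairs = map { [ $_, $_ + 1 ] } 1 .. 2000;
for my $style ([ iterative => \&multiply_iterative ],
               [ queued    => \&multiply_queued ]) {
    my ($name, $code) = @$style;
    my $start = time;
    $code->(@pairs);
    printf "%-9s %.4fs\n", $name, time - $start;
}
```

      The ratio between the two timings is the "penalty" in question. With one tiny multiplication per queue item the queue traffic dominates, so this is close to a worst case for the producer/consumer style.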

      ...I'll attempt to produce a fair comparison...

      I'm anxious to hear those results. I might even show Marc Lehmann the results at the next Perl workshop, if he is there.

        Hi, thanks for trying to dissect my benchmark (unfortunately you failed to understand it, and spread a lot of FUD, e.g. calling time takes the same amount of time in threads as in Coro, so if anything, ithreads have an advantage here because they can run it in parallel, while Coro threads cannot). I don't fault you, because ithreads are horribly confusing to most people: they wrongly imply that they are threads, and they wrongly imply that they are useful for anything on non-Windows machines.

        The talk I gave was about threading models in general in scripting languages.

        Threads are defined by a shared address space, so ithreads are not threads in the first place. Many people (including you) confuse them with real threads (as implemented by Coro), doubtlessly because of the badly chosen name.

        What I wanted to show, and without doubt succeeded in showing, is that in data-sharing-intensive scenarios, ithreads totally lose, because they can't share data efficiently (which is normally the only advantage of threads).

        I also showed that data sharing is so slow with ithreads as to be useless. Real processes certainly perform better than ithreads in just about any scenario.

        The problem with ithreads is that they emulate Unix processes (that's what they were written for) in software, something a Unix system does efficiently in hardware.

        This CPU/MMU emulation costs around 25%, whether used or not. It also means that, unlike with Unix processes, starting an ithreads pseudo-process involves a very costly copy operation, doubling the amount of memory used (under Unix only small amounts of memory are necessary).

        So, the bad points about ithreads are that they slow down every perl interpreter for a Windows hack, that they slowly emulate in software what a typical OS today does in hardware, and that they don't implement threads at all, as they emulate processes. Processes, of course, can be had faster and with much less overhead by, well, using OS processes instead of the emulated ones.

        What's in favour of ithreads? The only thing is the existing API that allows you to create and join processes easily, and implements data copying between them. That advantage, however, no longer exists now that the API-compatible "forks" module exists, which implements the ithreads API using real processes, outperforming it greatly in most cases.

        So in summary, ithreads are a total misnomer, as they don't even implement threads. They should not be enabled on non-Windows perls for that reason, as non-Windows perls have a hardware-assisted process implementation, which can be used instead.

        The benchmark is meant to illustrate how expensive it is to treat ithreads as if they were real threads (both in program complexity and in CPU time), compared to a real threads package.

        Without doubt, it did succeed in that.