in reply to Taking advantage of dual processors [tag://parallel,cygwin,multi-core,fork,i/o]

You want one process to block on IO (at least block buffered, if not more) while the other process blocks on a database commit?

This doesn't seem like the type of problem where concurrency will help you much.

Re^2: Taking advantage of dual processors
by BrowserUk (Patriarch) on Nov 19, 2007 at 20:34 UTC

    I'm confused why you think it wouldn't?

    Done serially with '.'s representing time taken:

    v---------------------------------|
    read........munge....insert........

    versus overlapping the insert and the read in parallel:

    v-----------------------|
    read........munge.....QI.
                          v-----------------------|
                          DQ.insert........Wait....

    Even on a single cpu system, and with the DB running on the same box, the insert can go ahead whilst the disk head locates the next chunk of data.

    It will depend upon the relative expense of the overlapped operations, but on a multi-cpu system the elapsed time should be reduced?
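
    For concreteness, a minimal sketch of that overlap using threads and Thread::Queue might look like the following. The munge()/do_insert() stubs and the data.txt filename are placeholders for illustration, not code from the original post:

        use strict;
        use warnings;
        use threads;
        use Thread::Queue;

        # Placeholder steps; the real munge and DBI/SQLite insert go here.
        sub munge     { my ( $line ) = @_; return uc $line }
        sub do_insert { my ( $row  ) = @_; print "insert: $row" }

        my $Q = Thread::Queue->new;

        # Writer thread: dequeues rows and performs the (slow) inserts,
        # overlapping with the read/munge loop below -- the DQ side above.
        my $writer = threads->create( sub {
            while ( defined( my $row = $Q->dequeue ) ) {
                do_insert( $row );
            }
        } );

        # Main thread: read and munge, handing each row to the queue (QI).
        open my $in, '<', 'data.txt' or die "data.txt: $!";
        while ( my $line = <$in> ) {
            $Q->enqueue( munge( $line ) );
        }
        close $in;

        $Q->enqueue( undef );   # signal end of data
        $writer->join;          # wait for the remaining inserts to drain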


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Does SQLite perform blocking flushes on every commit? Does it perform asynchronous writes?

      Is the file to read on the same bus as the database file? Are they on the same drive? Are they on the same platter?

      It would surprise me to see much more than a 15% improvement from concurrency, and it wouldn't surprise me at all to see performance decrease.
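
      One quick way to answer the first of those questions from Perl is to ask SQLite directly. This is only an illustrative probe (the test.db filename is made up, not from the thread); PRAGMA synchronous reports 2 (FULL) when every commit does a blocking fsync, 1 (NORMAL) or 0 (OFF) otherwise:

          use strict;
          use warnings;
          use DBI;

          # Probe SQLite's commit-time flushing behaviour.
          my $dbh = DBI->connect( 'dbi:SQLite:dbname=test.db', '', '',
                                  { RaiseError => 1 } );

          # 2 = FULL (blocking fsync per commit), 1 = NORMAL, 0 = OFF.
          my ( $sync ) = $dbh->selectrow_array( 'PRAGMA synchronous' );
          print "synchronous = $sync\n";

          # Trade durability for speed: skip the fsync on commit.
          $dbh->do( 'PRAGMA synchronous = OFF' );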

        Does SQLite perform blocking flushes on every commit? Does it perform asynchronous writes? Is the file to read on the same bus as the database file? Are they on the same drive?

        All good questions, but I bet it took me far less time to write the thread code I posted than it would take the OP to ascertain the answers. And much less than it would take him to a) understand the significance of his findings; and b) construct a formula to determine whether or not concurrency would be beneficial in the light of those findings.

        It would surprise me to see much more than a 15% improvement from concurrency,

        Worth having if it's available. And running the benchmark would take very little time.
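
        As a rough illustration, the benchmark need be little more than this; run_serial() and run_threaded() are stand-ins for the OP's two variants (assumed here, not code from the thread):

            use strict;
            use warnings;
            use Time::HiRes qw( gettimeofday tv_interval );

            # Stand-ins for the existing loop and the queued, overlapped version.
            sub run_serial   { sleep 2 }
            sub run_threaded { sleep 1 }

            # Trivial wall-clock comparison of the two approaches.
            for my $variant ( [ serial => \&run_serial ], [ threaded => \&run_threaded ] ) {
                my ( $name, $code ) = @$variant;
                my $t0 = [ gettimeofday ];
                $code->();
                printf "%-10s %.2f s\n", $name, tv_interval( $t0 );
            }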

        and it wouldn't surprise me at all to see performance decrease.

        And it's only 10 or so lines of code and, say, half an hour's effort to discard if not.

        Are they on the same platter?

        Do modern drives use multiple platters? In any case, it is unlikely that there is any filesystem that would allow you to control the placement of individual files at that level.

        I must admit that I think there is probably more potential for performance gain in avoiding (at least) one of the three duplications of the data inherent in the following code:

        my $d = Data::Dumper->new( [ \@dump ], [ 'dump' ] );
        $d->Purity(1)->Terse(1)->Deepcopy(1);
        ...
        ..., $d->Dump );

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Taking advantage of dual processors
by metaperl (Curate) on Nov 19, 2007 at 21:19 UTC
    Well, what I want is maximum throughput for two operations which are currently sequential. It seems like dispatching the database commit to a second CPU would allow the first one to continue reading, but I really don't know.
    I have beheld the tarball of 22.1 on ftp.gnu.org with my own eyes. How can you say that there is no God in the Church of Emacs? -- David Kastrup

      Why don't you try it? Make these extra processes happen in a controlled fashion. Even without a pool of permanent SQLite writers fed from a queue, which would be the right way to do it (as BrowserUk pointed out, even though his example uses threads), you can still prototype quickly with the code you've shown: keep an array of at most n commits-to-do, and every m < n lines fire m processes with system("my_writer $update &"). Throttle a bit if ps (or even /proc) shows too many my_writer processes, and fire more if your process count drops below a threshold (this can be made adaptive...). Decide what to do if you reach n (slow down, wait for the process count to drop below n_min, etc.). cheers --stephan
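
      A rough sketch of that throttling idea, with my_writer taken from the description above and the thresholds and the read-updates-from-STDIN loop invented for illustration:

          use strict;
          use warnings;

          my $MAX = 8;   # ceiling on concurrent my_writer processes (illustrative)
          my $MIN = 2;   # resume firing once the count drops below this (illustrative)

          # Count running my_writer processes via ps, as suggested above.
          sub writer_count {
              return scalar grep { /\bmy_writer\b/ } `ps -ef`;
          }

          while ( my $update = <STDIN> ) {
              chomp $update;

              # Throttle: too many writers already? Wait for the count to fall.
              if ( writer_count() >= $MAX ) {
                  sleep 1 while writer_count() > $MIN;
              }

              # Fire-and-forget a background writer for this unit of work.
              system( qq{my_writer "$update" &} );
          }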