Well, you've certainly reminded me why I hate and fear multithreading.

The problems I describe are not confined to "multi-threading"; they affect multi-processing of all forms.

For example, the typical approach to multiprocessing web applications is to use a pre-forking server to run the code and an RDBMS to store the data. The oft-misunderstood 'benefit' of this approach is that, as forks don't share state, they don't need locking. But this completely misses that all that has happened is that the shared state has been moved into the DB, which now has to do the locking for you, and so it suffers from all the same problems of exponential lock contention as the number of clients trying to access the same dataset increases.
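To make that concrete, here is a minimal sketch in Python. It assumes SQLite (via the standard `sqlite3` module) as a stand-in for the RDBMS; the function name, table and column names are made up for illustration. SQLite locks the whole database file, which is much coarser than the row-level locking a server RDBMS would typically use, so treat it as an exaggerated picture of the same effect: two "workers" that share no memory still collide on the lock the database takes on their behalf.

```python
import os
import sqlite3
import tempfile


def demo_db_lock_contention():
    """Two 'workers' share no memory, yet still collide on the database's lock."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        # isolation_level=None puts the connection in autocommit mode, so
        # transactions are controlled explicitly with BEGIN/COMMIT.
        worker_a = sqlite3.connect(path, isolation_level=None)
        worker_a.execute("CREATE TABLE counter (id INTEGER PRIMARY KEY, hits INTEGER)")
        worker_a.execute("INSERT INTO counter VALUES (1, 0)")

        # timeout=0: fail immediately instead of waiting for the lock.
        worker_b = sqlite3.connect(path, timeout=0, isolation_level=None)

        # Worker A takes the write lock and holds it.
        worker_a.execute("BEGIN IMMEDIATE")
        worker_a.execute("UPDATE counter SET hits = hits + 1 WHERE id = 1")

        # Worker B shares no state with A, but its write is still refused.
        try:
            worker_b.execute("UPDATE counter SET hits = hits + 1 WHERE id = 1")
            blocked = False
        except sqlite3.OperationalError:
            blocked = True

        worker_a.execute("COMMIT")

        # Once A releases the lock, B's write goes through.
        worker_b.execute("UPDATE counter SET hits = hits + 1 WHERE id = 1")
        hits = worker_b.execute("SELECT hits FROM counter WHERE id = 1").fetchone()[0]

        worker_a.close()
        worker_b.close()
        return blocked, hits
    finally:
        os.remove(path)
```

In a real deployment worker B would normally wait rather than fail, and that waiting is exactly the contention that grows as more clients pile onto the same data.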

It is exactly these problems with data access through a central "database manager" not scaling to deal with hyper-scale web applications that are the driving force behind the move away from RDBMSs in favour of the whole raft of distributed data stores broadly categorised under the title NoSQL. Hence you get Google's BigTable; CouchDB; MongoDB; Terrastore etc.

Back in the days when the biggest distributed apps were banks and credit cards with a few tens of thousands of clients processing a few million data accesses per day, routing all those accesses through a central DBM worked. It required BigIron, highly structured and indexed data and very few, very well-defined queries, but it worked and worked well.

Then suddenly you get hyper-scale web applications where you have millions of concurrent clients and billions of transactions every day, asking a myriad of free-form queries against huge and broadly unstructured datasets. At that point, having all your clients talking to one central DBM managing one huge data store is not just hugely expensive; it is quite simply impossible. BigIron cannot get that big. And so 'the cloud' was born.

The only way forward is to distribute your dataset. Note that distribute is not the same as replicate. Subset the dataset into manageable chunks and have different processors (or clusters of processors) managing those discrete chunks. But then, your clients can no longer talk directly to a single DBM because each DBM only has access to a small subset of the overall data. Instead, clients talk to lightweight front-ends that know enough to be able to break up the inbound query into sub-queries, which they route to the distributed DBMs as required. They then gather the various responses from those back-end DBMs and collate the results before finally wrapping them in the presentation layer and sending the reply back to the client.
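As a rough illustration of that front-end/back-end split, here is a toy sketch in Python. Everything in it is an assumption made for the example: the `Shard` and `FrontEnd` classes, the CRC32-modulo routing, and the in-memory dicts standing in for real back-end DBMs. It is not how any particular NoSQL store does it. Each shard holds a disjoint subset of the keys (distributed, not replicated), and the front-end scatters a query to every shard and then collates the partial results.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


class Shard:
    """Stand-in for one back-end DBM that owns a discrete chunk of the data."""

    def __init__(self):
        self.rows = {}

    def put(self, key, value):
        self.rows[key] = value

    def query(self, predicate):
        # Each shard only sees, and only searches, its own subset.
        return [(k, v) for k, v in self.rows.items() if predicate(k, v)]


class FrontEnd:
    """Lightweight front-end: routes writes, fans out queries, collates results."""

    def __init__(self, n_shards):
        self.shards = [Shard() for _ in range(n_shards)]

    def _shard_for(self, key):
        # Stable hash so a given key always lands on the same shard.
        return self.shards[zlib.crc32(key.encode("utf-8")) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key).put(key, value)

    def get(self, key):
        # A single-key lookup touches exactly one shard.
        return self._shard_for(key).rows.get(key)

    def query(self, predicate):
        # Scatter the sub-queries, then gather and collate the responses.
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = pool.map(lambda s: s.query(predicate), self.shards)
            return sorted(row for part in partials for row in part)
```

Something like `fe = FrontEnd(n_shards=4)`, a handful of `fe.put("user7", 7)` calls and then `fe.query(lambda k, v: v % 5 == 0)` exercises the whole path. Scaling "width-ways" then amounts to adding shards, although plain modulo routing would force most keys to move when the shard count changes, which is why real systems tend to use something like consistent hashing instead.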

It requires new architectures and new thinking, but the result is that applications can be scaled by expanding width-ways -- adding more cheap, commodity boxes at the front or back as required -- rather than having to buy bigger and bigger individual boxes at both ends as used to be the case.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^5: Multithreaded (or similar) access to a complex data structure by BrowserUk
in thread Multithreaded (or similar) access to a complex data structure by FloydATC
