There is a lot of literature about how to set up websites with no single point of failure. For instance, you have a pair of load balancers configured for fail-over: if the primary goes down, the secondary takes over. Having no single point of failure is more reliable.
Remember that my proposition was that these are tomorrow's commodity machines we're talking about. So, rather than today's 16-way cluster to ensure peak-time bandwidth, we only need two; but the cost of these machines is the same, so throw in a third for posterity.
For your ecommerce site's requirements: instead of running a 16-way cluster of 2-core machines, you run a 2-way cluster of 8-core machines. Your site will have the same headroom in processor power to deal with your peaks, and you also gain fail-over redundancy.
Which brings us to Google's MapReduce. Suppose you have a job that will run on 2000 machines and takes 5 hours.
See below.
A big factor that I think you're missing is that keeping RAM on takes electricity.
I don't believe I have missed that. Memory-mapped files do not (have to, or usually) exist entirely in RAM; they can be, and are, paged on and off disk, often through relatively small windows. The benefit of the 64-bit address space is the simplicity of the mapping, which gets messy when mapping a 2 or 4 GB address space over one (or more) >4 GB files.
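A minimal sketch of the windowed-mapping idea (in Python for brevity, though the thread is about Perl; Python's mmap is a thin wrapper over the same OS facility, and the file and sizes here are invented for illustration):

```python
import mmap
import os
import tempfile

# Create a (here, deliberately small) file standing in for a
# multi-gigabyte database file.
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * (1 << 20))  # 1 MiB of zeroes
os.close(fd)

with open(path, "r+b") as f:
    # Map a 64 KiB window starting 512 KiB into the file, rather than
    # the whole file; the OS pages data in only as it is touched.
    offset = 512 * 1024  # must be a multiple of mmap.ALLOCATIONGRANULARITY
    mm = mmap.mmap(f.fileno(), 64 * 1024, offset=offset)
    mm[0:5] = b"hello"   # write through the mapping
    mm.flush()           # push the dirty page back to disk
    mm.close()

with open(path, "rb") as f:
    f.seek(512 * 1024)
    data = f.read(5)
print(data)  # b'hello'
os.remove(path)
```

With a 64-bit address space, the window bookkeeping disappears: you can map the whole file once and let the OS do the paging.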
RAM uses less power than disks. A 1/16th (or greater) reduction in the number of disks, and a 1/8th reduction in most other components, means that even if you keep the same volume of RAM but distribute it into 1/8th as many machines, you're already saving energy. And the power draw of each generation of RAM chips/modules has either remained static or fallen, whilst the capacity has quadrupled or more with each generation.
Keeping intermediate results in RAM, and performing previously serial processes on them there, also means that you are able to utilise the time and the processors more fully.
In your example above: one job is 2000 machines * 5 hours = 10,000 machine-hours of processing. The same job takes 250 machines * 2 hours = 500 machine-hours. 10,000 : 500 == 20:1 saving in processing capacity consumed, but that's not (just) an efficiency saving. It's also a 20:1 energy saving, as you don't have to (continually) run 2000 machines to process the job in a timely manner. You also gain quiet-time upgrade ability.
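Spelled out (trivially, and assuming the 250-machine / 2-hour figures used above):

```python
# Machine-hours consumed under the two configurations discussed above.
before = 2000 * 5   # 2000 machines running for 5 hours
after = 250 * 2     # 250 bigger machines running for 2 hours
ratio = before // after
print(before, after, ratio)  # 10000 500 20
```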
And when you talk about AJAX, you've made some big assumptions that are mostly wrong (at least in today's world). Any thread of execution that is doing dynamic stuff takes up lots of resources, be it memory, database handles, or whatever. As a result, standard high-performance architectures go to great lengths to move the heavyweight dynamic stuff as fast as possible from client to client.
I think you're wrong on this, but I do not know enough of current practice on AJAX sites to counter you.
I disagree about Perl's main failing. Perl's main failing here is not that Perl doesn't recognize that sometimes you want to be concurrent and sometimes not; it is that there are a lot of operations in Perl that have internal side effects that you wouldn't expect.
I opened with "Due to its inherently side-effectful nature, Perl is not the right language for multi-threaded programming.", and I don't think I said anything later that contradicts that?
I wouldn't worry about the practical difficulties.
I really wasn't.
My point was that without the "hidden semantics", the parallelisation of DB operations is a natural big win, but with those semantics it's much less so. Whilst DB operations remain an out-of-box, cross-network, many-to-one communications affair, the communications overheads remain constant and contention dominates.
Once you have the processor(s), address space and memory to move the DB into the same box as the application programs, the communications overheads disappear. Instead of the DBM process sharing its limited address space between the demands of multiple callers, the common code runs in the address space of the caller.
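SQLite is not mentioned above, but it is a concrete present-day instance of this pattern: the database engine's code runs in the caller's own process, so there is no socket, no wire protocol, and no separate server to contend for. A minimal sketch (table and values invented for illustration):

```python
import os
import sqlite3
import tempfile

# An in-process database: the engine is a library linked into the
# caller, not a server reached over the network.
path = os.path.join(tempfile.mkdtemp(), "shop.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
con.execute("INSERT INTO orders (total) VALUES (?)", (19.99,))
con.commit()
(total,) = con.execute("SELECT total FROM orders").fetchone()
print(total)  # 19.99
con.close()
```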
With a 64-bit address space, instead of storing your tables and indexes in myriad separate files, each database becomes a single huge file, mapped into memory on demand and cached there. Disk/file locks become memory locks.
Think of it like this: just as virtual addressing is used to provide swapping for processes now, so it gets used to provide shared-memory access to your databases. All the logic remains the same: locking, caching, the works. The difference is that it all happens at RAM speed rather than disk speed; you lose all the communications overhead and the serialisation of results through (relatively) low-speed, high-latency sockets, and the need for DB handles simply disappears.
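A sketch of the page-cache-as-the-single-shared-copy point (Python for brevity; both mappings live in one process here, but shared file mappings behave the same way between processes, which is what lets many callers see one coherent database image):

```python
import mmap
import os
import struct
import tempfile

# Two independent mappings of the same file see each other's writes:
# the OS page cache holds the single shared copy, just as it would
# for a database file mapped by many processes.
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * 8)
os.close(fd)

f1 = open(path, "r+b")
m1 = mmap.mmap(f1.fileno(), 8)
f2 = open(path, "r+b")
m2 = mmap.mmap(f2.fileno(), 8)

m1[:8] = struct.pack("q", 42)          # write through mapping 1
(value,) = struct.unpack("q", m2[:8])  # read through mapping 2
print(value)  # 42

for h in (m1, m2, f1, f2):
    h.close()
os.remove(path)
```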
In reply to Re^12: Parrot, threads & fears for the future by BrowserUk
in thread Parrot, threads & fears for the future by BrowserUk