Someone once said that a web application is truly scalable if all you need to do to handle more load is buy more machines. That idea is very appealing to me. Imagine you start a website with a single box running both the web server and the database server; then, as your user base grows, you add servers and load balancers; and eventually you add data centers (hosting on both the west and east coasts, in Europe, etc.). Ideally all of this would be done in a high-availability fashion.

My question is: what are the main design decisions needed to accomplish this? For example, not storing anything in sessions (though load balancers can be sticky)? Clusters of web servers and database servers? How does database replication/synchronization work across data centers (how fast and how expensive is it)? How do the big boys (Google, Yahoo, eBay, Amazon, etc.) accomplish their scalability? Is there anything special about Perl/mod_perl one can take advantage of, or needs to be careful about? Thanks.

Re: Hardware scalable web architecture
by BrowserUk (Patriarch) on Apr 10, 2006 at 07:41 UTC

    There is a pretty good high-level overview of the architecture Google uses at (pdf).


Re: Hardware scalable web architecture
by fergal (Chaplain) on Apr 10, 2006 at 09:49 UTC

    "truly scalable" basically means paralellisable - that is each machine can operate independently of each other machine. There aren't so many interesting web apps where this is true. For example, anything where users can create content that needs to be visible by all other users will require a database that is read by all machines and this will be your bottleneck to scalability.

    An example of something that is truly scalable is a news site (excluding comments). You simply push the new content out to each machine and let them serve it. If you need more capacity, you add more machines. They can even be in different parts of the world.

    You can mix the two situations to help with scaling. For example, you might have 100 main web servers which weave together static and dynamic content. They don't talk to the database; that's left to the 20 dynamic content servers, which generate HTML and pass it to the main web servers. They might also do caching, etc. This way you limit the number of machines holding open DB connections. It doesn't matter so much how your static content grows (you could start serving videos instead of text), because that's handled by the main web servers and you can add lots more of those. You're still limited in how much dynamic content you can serve, because every piece of dynamic content comes from the DB.
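    A minimal sketch of that caching idea, assuming the dynamic content servers keep generated HTML fragments in memcached so repeated requests skip the DB entirely; the server addresses, key names, and render_article_from_db() are placeholders:

        use strict;
        use warnings;
        use Cache::Memcached;

        # Hypothetical pool of cache servers shared by the dynamic content tier
        my $memd = Cache::Memcached->new({
            servers => [ '10.0.0.10:11211', '10.0.0.11:11211' ],
        });

        sub fetch_fragment {
            my ($article_id) = @_;
            my $key = "article_html:$article_id";

            # Serve straight from the cache when we can ...
            my $html = $memd->get($key);
            return $html if defined $html;

            # ... otherwise hit the DB, render, and cache the result for 5 minutes
            $html = render_article_from_db($article_id);   # your own DB/templating code
            $memd->set($key, $html, 300);
            return $html;
        }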

    The "big boys" are lucky. Plain old web search requires no user data, all you need is a cluster of machines with the data. You can scale with user growth by simply throwing another cluster at it. Scaling with data growth is going to be the hard part.

      One way to scale database interaction better is to set up slave DB servers which are used only for querying data. The master DB server is used for inserting new data and replicating it to the slaves. Architecturally this is a lot easier than a truly replicated database setup (with multiple masters that can be used for both reading and writing), and several such solutions exist (see, for example, Slony for PostgreSQL). fergal's point still applies, though: not all applications can benefit from this kind of setup.

      So if you want to leave this option open for the future while building your web app, you should use separate handles for reading and writing to the DB. They can point to the same server in the first deployment, and later be changed to access the master and the slave(s) when you need to scale.
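      A minimal sketch of that split using plain DBI; the DSNs, credentials, and table are placeholders, and at first both handles can simply point at the same server:

          use strict;
          use warnings;
          use DBI;

          # Initially both handles can point at the same box; later $dbh_r moves
          # to a read-only slave while $dbh_w stays on the master.
          my $dbh_w = DBI->connect( 'dbi:mysql:database=app;host=db-master',
                                    'appuser', 'secret', { RaiseError => 1 } );
          my $dbh_r = DBI->connect( 'dbi:mysql:database=app;host=db-slave1',
                                    'appuser', 'secret', { RaiseError => 1 } );

          # Writes always go through the master handle
          my ( $user_id, $body ) = ( 42, 'hello world' );
          $dbh_w->do( 'INSERT INTO comments (user_id, body) VALUES (?, ?)',
                      undef, $user_id, $body );

          # Reads go through the slave handle
          my $rows = $dbh_r->selectall_arrayref(
              'SELECT user_id, body FROM comments ORDER BY id DESC LIMIT 10'
          );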


Re: Hardware scalable web architecture
by wazoox (Prior) on Apr 10, 2006 at 10:49 UTC
    You can store data in sessions, but you may need to be able to share the sessions between servers, either by storing them in a database or on some shared storage (an NFS server).
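    A minimal sketch of the database route, assuming Apache::Session::MySQL with its standard sessions table; the DSN and credentials are placeholders. Any web server that can reach this database sees the same session data:

        use strict;
        use warnings;
        use DBI;
        use Apache::Session::MySQL;

        # Placeholder connection; every server in the pool points at the same DB,
        # which holds Apache::Session's usual 'sessions' table.
        my $dbh = DBI->connect( 'dbi:mysql:database=sessions;host=db-master',
                                'appuser', 'secret', { RaiseError => 1 } );

        # Pass an existing session id (e.g. from a cookie), or undef to create one
        my $session_id = undef;
        tie my %session, 'Apache::Session::MySQL', $session_id,
            { Handle => $dbh, LockHandle => $dbh };

        $session{last_seen} = time();
        my $id_for_cookie = $session{_session_id};   # send this back in a cookie

        untie %session;   # flushes the session data back to the DB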
    Load balancing can be tricky too, because you don't want the load balancer itself to become a single point of failure... That's why many people start with the simplest load balancer of all, round-robin DNS (each new incoming connection resolves to a different server, and you have at least two DNS servers, don't you?).
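    A small sketch for sanity-checking a round-robin setup with Net::DNS (the hostname is a placeholder); you should see every front-end server's address, and the order rotates from query to query:

        use strict;
        use warnings;
        use Net::DNS;

        my $res   = Net::DNS::Resolver->new;
        my $reply = $res->query( 'www.example.com', 'A' )    # placeholder hostname
            or die 'DNS query failed: ', $res->errorstring;

        # Round-robin DNS simply hands out all the A records, rotated each time
        for my $rr ( $reply->answer ) {
            print $rr->address, "\n" if $rr->type eq 'A';
        }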
Re: Hardware scalable web architecture
by rhesa (Vicar) on Apr 10, 2006 at 14:11 UTC
    I'm still in the process of making our company's app fully scalable. The only outstanding issue is the file server: I have yet to find a good solution for mirroring our NFS server. Our data is in the low terabytes at the moment, and growing by a couple of gigabytes per day. A typical rsync run takes a couple of hours, so that's not a real-time solution. Does anyone have good pointers on this?

    As for the rest, everything is redundant. We have 4 tiers:

    1. Static front-end Apaches
    2. mod_perl Apaches (*)
    3. Replicated MySQL servers
    4. A big fileserver
    (*) Actually, this layer also has a number of Apaches set up to do straight CGI, for those long-running requests like file uploads.
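    Purely as a sketch of what tier 2 can look like in code (assuming mod_perl 2; the module name, database, and credentials are invented), a minimal response handler that renders HTML from one of the replicated MySQL servers:

        package My::Dynamic::Handler;
        use strict;
        use warnings;

        use Apache2::RequestRec ();
        use Apache2::RequestIO  ();
        use Apache2::Const -compile => qw(OK);
        use DBI;

        # One connection per Apache child, pointed at a MySQL replica (tier 3)
        my $dbh = DBI->connect( 'dbi:mysql:database=app;host=mysql-replica',
                                'appuser', 'secret', { RaiseError => 1 } );

        sub handler {
            my $r = shift;

            # Tier 1 (the static front-end Apaches) proxies dynamic requests here
            my ($title) = $dbh->selectrow_array('SELECT title FROM pages LIMIT 1');

            $r->content_type('text/html');
            $r->print("<html><body><h1>$title</h1></body></html>");
            return Apache2::Const::OK;
        }

        1;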

    Here are some links I found useful in setting this up:

    So that still leaves me with the file server. I haven't been able to investigate clustered filesystems such as GFS, Coda and others yet, but there may be a solution there.

    And the other big thing I'm not trying to worry about just yet is how to mirror across data centers. I'll cross that bridge when we get there :)

      To make your NFS servers highly available, see for instance http://linux-ha.org/HaNFS . It's pretty easy, but using a SAN is mandatory. Of course, if you're building an HA system, you're probably using serious RAID arrays, a SAN, and the like anyway.
        Thanks for that link! I'm not sure whether it will work in our setup, but I'll look into it.

        We're currently using an AoE (ATA over Ethernet) device (made by Coraid). But I have to admit that I don't know much about this stuff: it was all set up by our ISP.