in reply to Are you coding websites for massive scaleability?
A little of my experience with Perl webapps, infrastructure, and scalability.
I am using a lightweight front web server (nginx, but lighttpd is in the same category) that serves the static content and forwards dynamic requests to the backend. Here you have two options for the backend: either proxy to another HTTP server (perhaps Apache with mod_perl) or use FastCGI. FCGI processes can live on the same server in the beginning and later be migrated to separate servers.
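A minimal sketch of such a front-end setup for nginx, assuming the FCGI backend listens on 127.0.0.1:9000 and static files live under /var/www/site (both hypothetical):

```nginx
server {
    listen 80;
    server_name example.com;

    # Static content is served directly from disk by nginx
    root /var/www/site;

    location /static/ {
        # nginx handles these files itself, no backend involved
    }

    location / {
        # Dynamic requests are forwarded to the FastCGI backend;
        # moving the backend to another machine is just a change here
        include fastcgi_params;
        fastcgi_pass 127.0.0.1:9000;
    }
}
```

Because the backend address is a single line of config, migrating FCGI processes to a separate server later does not disturb the rest of the setup.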
My experience (comparing FCGI and mod_perl+Apache) is that you can get better performance with FCGI - actually very close in terms of requests per second, but with less deviation and fewer failed requests.
Another advantage of FastCGI is that it is easier to administer - you can move portions of the site from server to server, relaunch just some parts of the site, and give different privileges to different parts of the site (think of uploads, writable directories, number of processes, memory limits, etc.).
Now for the Perl side. I use CGI::Application as the base framework - it is very lightweight and easy to customize. You may have problems with certain C::App plugins that do not play well with FCGI; one of my problems was that some of them are designed to create one object per request. So I replaced them with some custom code (which should land on CPAN soon). Now I can create the webapp object (with its heavy initialization) only once and just pass each request to it. This gives a major performance boost (I was unable to do the same with mod_perl+C::App).
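A sketch of that pattern with FCGI.pm, where MyApp is a hypothetical CGI::Application subclass written so one instance can serve many requests (stock C::App normally builds an object per request, which is what the custom code above works around):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use FCGI;
use MyApp;    # hypothetical CGI::Application subclass

# Heavy initialization (config, templates, DB handles) runs once,
# when the FCGI process starts - not on every request.
my $app = MyApp->new();

my $request = FCGI::Request();
while ($request->Accept() >= 0) {
    # The long-lived object handles each incoming request;
    # MyApp is assumed to reset its own per-request state.
    $app->run();
}
```

The expensive work is paid once per process lifetime instead of once per request, which is where the performance difference over a per-request object comes from.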
Now for the DB side. I use plain DBI (with some helper functions). The reason for this is that most of the SQL queries I write could not be expressed easily - or at all - through some abstraction layer. Think of this: disable merge joins in this session, run the query, revert to the default planner behavior. Why choose this route? Because database engines are built to manipulate data quite fast - you cannot match them in Perl - so tell the DB to give you only the data you need, ordered the way you need it, in the fewest possible requests.

Another paragraph about database scaling. One way is to use some kind of full replication and a common pool for the connections. Another option is database partitioning. If you can partition your data into relatively autonomous domains, you can spread it across a number of different DB servers (replicating only some small portion of the data) - this gives you more scalability than plain replication. (Table partitioning is another story.)
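The planner-tweaking example above looks like this with plain DBI against PostgreSQL (a sketch; the connection parameters and the query itself are hypothetical):

```perl
use strict;
use warnings;
use DBI;

# Hypothetical connection parameters
my $dbh = DBI->connect('dbi:Pg:dbname=mydb', 'user', 'pass',
                       { RaiseError => 1, AutoCommit => 1 });

# Disable merge joins for this session only (PostgreSQL planner setting)
$dbh->do('SET enable_mergejoin = off');

# Let the database do the filtering and ordering,
# fetching exactly the rows we need in one round trip
my $rows = $dbh->selectall_arrayref(
    'SELECT id, name FROM items WHERE status = ? ORDER BY name',
    { Slice => {} },
    'active',
);

# Revert to the default planner behavior
$dbh->do('RESET enable_mergejoin');
```

Session-level statements like these are exactly the kind of thing that is awkward to express through a generic abstraction layer, which is the argument for staying with plain DBI here.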
And finally, on some of your questions:
Best regards