stonecolddevin has asked for the wisdom of the Perl Monks concerning the following question:

Hi Folks,

I'm about to embark on the largest web application venture I've ever been exposed to and for a very reasonable amount of compensation. This project will include it all: terabytes of space to work with, multiple read/write databases, possible server clusters, and the necessity for scalability.

I'll be using a pre-existing software package to modify and create this large application, and will have a good deal of time to do it in.

My 4 questions are:

1. What do I need to know about large web applications written in perl using a mysql backend?

2. Where can I find resources on programming for server clusters? (web front end and mysql clusters)

3. What tools should I use for laying out the architecture of the project? (software architecture software, UML diagrams, etc.)

4. What other things am I going to need to take into consideration when I actually go to sit down and start coding this?

Grant me your wisdom, those who have it.


Replies are listed 'Best First'.
Re: Large Web Application Ponderings
by dragonchild (Archbishop) on Nov 09, 2007 at 02:43 UTC
    Unless you have already done so, you will need to hire people with the following skillsets:
    • DBA
    • Sysadmin

    Unless you have personally been a DBA or sysadmin on a large project, you will mess it up. This isn't optional. There are a number of critical FAIL-if-you-don't-take-into-account considerations that a programmer will never think of, because that's not their job. I am both a programmer and a DBA, but I'm not a sysadmin, so I make sure I have a list of very good sysadmins I can call when I need to. Just as I'm on that list of DBAs for many monks. Unless you have done this, everything else is completely and utterly worthless.

    I'm saying that as someone who's been DBA and developer on a number of large web applications written in Perl using a MySQL backend. If you would like some further advice on the matter, feel free to contact me either here on at

    Beyond that, it's the basics. Pick a web framework, an ORM, webserver, and application server. I personally prefer Catalyst, DBIx::Class, Apache, and FCGI. I know smart people who like Jifty, Jifty, Apache, and mod_perl. There are a number of choices. The key is to pick one and learn it inside and out.

    As for laying out the app, screw the documentation crap. Build it with tracer bullets and week-long deliverables. If you can't break a piece down to week-long chunks, look again.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

      I appreciate your frank response dragonchild. This is obviously something I need to bring up to my client, so what is the recommended course of action as far as telling my client they need to hire someone with those skill sets as well?

      Also, what do you mean by break it down into "week long chunks"? And how do you feel about using a pre-existing software package and modifying it? I had been thinking about nearly copying and pasting the logic into a Catalyst application and modifying it from there, because I KNOW how to set up Catalyst with FastCGI and use DBIx::Class with it, and I KNOW Catalyst can handle the load.

      Once again, I really appreciate the matter-of-fact advice.

Re: Large Web Application Ponderings
by perrin (Chancellor) on Nov 09, 2007 at 03:41 UTC
    If you don't know where to start, try a book or a case study. Building Scalable Websites is kind of all over the place, but contains a couple of useful chapters. The Practical mod_perl book is good if you use mod_perl, and some of it is good even if you don't. I wrote an article about my experiences at one large site.
Re: Large Web Application Ponderings
by Cabrion (Friar) on Nov 09, 2007 at 11:14 UTC

    You'll likely get a lot of good advice from your questions, but the truly wise are those that seek answers to the questions they didn't even know to ask.

    That would be your question #4.

    • Isolated test, QA & development environments
    • QA - unit/regression test management & load testing
    • Source code control - consistent procedures for fork/merge, etc.
    These coalesce into a higher order problem of version/revision control and roll-out/roll-back that will bite hard later on. When the apps get large and heavily used, rolling out a new version that doesn't break everything can be a real pain.

      Good points. I would definitely recommend looking into Selenium. It's a tool that lets you record a browser session and then export it as a test (you can export to a number of languages, including Perl). The tests can then be rerun at any time to make sure you didn't break anything. It's not only good for automated web interface testing, but also for things like populating a new install with data so you can run your tests against populated systems.
Re: Large Web Application Ponderings
by Your Mother (Archbishop) on Nov 09, 2007 at 20:14 UTC

    Dude, I love this question and hope you get lots of great answers (I'd love to learn from it too, this is just about where I'm finding myself at this point) but dragonchild is speaking gospel. If you have to ask these questions, you're probably not ready to do the project (at least not alone). I'd say, go slowly and try to emulate other successful, *similar* architectures.

    As far as books go, I just picked up Building Scalable Web Sites and Scalable Internet Architectures a couple of weeks ago. I find the first to be the worst O'Reilly title I own -- I'm only halfway through and having trouble forcing myself to read the rest -- packed with gems like: "here's a great technique, but it doesn't scale" and "you don't have to consider race conditions in your architecture as long as they're pretty rare." Scalable Internet Architectures I like much better. A good road map for real definitions and real planning. Flip through it at the bookstore and pick up a copy if you like it.

Re: Large Web Application Ponderings
by leocharre (Priest) on Nov 11, 2007 at 17:12 UTC

    There is no such thing as a web application, there is only a web interface. Personally, this is what I do: if I have a FastFood application, my tests.t, bin/fastfood, FastFood/, etc. code does nothing at all with web anything.
    I develop FastFood in a terminal environment. Test it all out etc.
    Once I am somewhat happy with that, I will start work on FastFood::WUI (Web User Interface).

    I have come to great affection for CGI::Application, so my WUIs are CGI::Application based interfaces to ..
    whatever- in this case; FastFood.
    Thus, my fastfood.cgi only uses FastFood::WUI, not FastFood, FastFood::Base, none of that.

    If you want to use CGI::Application- one word of advice; use as many of the plugins as possible, if they do what you want- even if the way they do it seems counter-intuitive to you. I've found that the plugins seem at first to do some unneeded stuff.. but when I proceed with it.. things make more sense. Almost as if.. the people that came before me knew what they were doing when they suggested these ways of working.. go figure!

    Compared to SQLite, in my experience- mysql is slow at inserts and fast on queries; consider doing any heavy lifting, operations.. cpu intense whatevers.. all offline, that is.. drop a crontab entry.

    Keep focus on the web interface *as the web interface* and NO more. Any operations, db interactions etc.. Code all of that in your base modules. If the code is not about taking user input or displaying output- it does not belong in the "web app", the web user interface, the WUI.
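
    The split described above could be sketched roughly as follows. This is only a minimal illustration in plain Perl, not CGI::Application itself: the methods add_order, total_items and handle_order, and the $params hash standing in for the request parameters, are all made up for the example. The point is that FastFood knows nothing about the web, and FastFood::WUI does nothing except turn input into calls and results into output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical base module: all the logic, no web code at all.
package FastFood;

sub new {
    my ($class) = @_;
    return bless { orders => [] }, $class;
}

# Record an order and return its 1-based order number.
sub add_order {
    my ($self, $item, $qty) = @_;
    die "quantity must be a positive integer\n"
        unless defined $qty && $qty =~ /^[1-9]\d*$/;
    push @{ $self->{orders} }, { item => $item, qty => $qty };
    return scalar @{ $self->{orders} };
}

# Sum of the quantities of every order recorded so far.
sub total_items {
    my ($self) = @_;
    my $total = 0;
    $total += $_->{qty} for @{ $self->{orders} };
    return $total;
}

# Hypothetical web layer: takes user input, displays output, nothing more.
package FastFood::WUI;

sub new {
    my ($class, %args) = @_;
    return bless { app => $args{app} }, $class;
}

# $params is an assumed stand-in for whatever the CGI layer hands over.
sub handle_order {
    my ($self, $params) = @_;
    my $number = eval {
        $self->{app}->add_order($params->{item}, $params->{qty});
    };
    return "<p>Error: bad order</p>" unless $number;
    return sprintf "<p>Order %d accepted; %d items so far</p>",
        $number, $self->{app}->total_items;
}

package main;

my $wui = FastFood::WUI->new(app => FastFood->new);
print $wui->handle_order({ item => 'burger', qty => 2 }), "\n";
print $wui->handle_order({ item => 'fries',  qty => 'lots' }), "\n";
```

    Because FastFood never touches the web, it can be exercised from a terminal or a test script exactly as it would be from the WUI.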

    My final suggestion would be to look up Conway's book, Perl Best Practices, although you probably have it already. He talks about documentation in there somewhere; I believe he also suggests writing the docs first, then the code. One very disciplined way to work is to write the tests first, as you would like to interact with the not-yet-existing API, and then write your API so that the tests do not fail. I use Test::Simple- very sexy.
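
    As a rough sketch of that tests-first workflow with Test::Simple: the FastFood::Menu class and its price and order_total methods are invented for this example. The tests at the top were written as if the API already existed; the package at the bottom is just enough code to make them pass. In a real project the tests would live in their own .t file.

```perl
use strict;
use warnings;
use Test::Simple tests => 3;

# Written first: how we would like to use the (hypothetical) API.
# Prices are in cents to keep the arithmetic in integers.
my $menu = FastFood::Menu->new(burger => 350, fries => 150);

ok( $menu->price('burger') == 350,  'price of a known item' );
ok( !defined $menu->price('sushi'), 'unknown item has no price' );
ok( $menu->order_total(burger => 2, fries => 1) == 850,
    'order total adds up' );

# Written second: just enough code to make the tests above pass.
{
    package FastFood::Menu;

    sub new {
        my ($class, %prices) = @_;
        return bless { prices => {%prices} }, $class;
    }

    # Returns the price of an item, or undef if it is not on the menu.
    sub price {
        my ($self, $item) = @_;
        return $self->{prices}{$item};
    }

    # Takes item => quantity pairs; dies on an item that is not on the menu.
    sub order_total {
        my ($self, %order) = @_;
        my $total = 0;
        for my $item (keys %order) {
            my $price = $self->price($item);
            die "unknown item: $item\n" unless defined $price;
            $total += $price * $order{$item};
        }
        return $total;
    }
}
```

    Running the tests before the package exists fails right away, which is the idea: the failures tell you what to write next.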

      Compared to SQLite, in my experience- mysql is slow at inserts and fast on queries; consider doing any heavy lifting, operations.. cpu intense whatevers.. all offline, that is.. drop a crontab entry.

      Spoken as a true programmer ... as in, someone who isn't a DBA. The SQLite DBMS is a great toy. It can be a useful development aid. But, unlike a RDBMS, SQLite doesn't enforce relationships, types, or anything that one would need to ensure data integrity. It doesn't have transactions (ACID or not), proper indices, and a lot of other stuff. In other words, the moment your application started being real, SQLite would do anything from fall over immediately (best case) to corrupt your data without you knowing it (worst case).

      I prefer MySQL (using the InnoDB table type only), but PostgreSQL, Oracle, DB2, Sybase / SQL*Server ... anything that's ACID-transaction compliant and provides proper foreign keys, indices and the rest of it will do. One of them is necessary for a proper application. This isn't optional.

      The rest of what you said - the WUI thing is a really cool concept that I think I'll be grabbing. But, data is the foundation of an application - treat it properly or it will beat you senseless.

Re: Large Web Application Ponderings
by clumsyjedi (Acolyte) on Nov 12, 2007 at 17:53 UTC
    This thread on the London Perl Mongers list elicited many thoughtful responses on this topic from many clever people. It's worth a read.

    The original poster started getting stroppy with the people helping him after a while, and the whole thing degenerated pretty quickly which is, IMHO, also worth a read.


Re: Large Web Application Ponderings
by sundialsvc4 (Abbot) on Nov 13, 2007 at 19:30 UTC

    The first thing to do ... the very first thing! ... is to determine exactly what you need to do, and to lay out an approximate timeline for doing it.

    This is not simply a "technical" task. In other words, "MySQL" and "clusters" and so-forth are merely parameters of the project's implementation. Their presence creates yet-another requirement that you must schedule for, namely: determining how you're going to deal with those issues. That will mean research, and you can't just "seat of your pants" any research-related issue. You must formally investigate them and formally come up with an answer to every one.

    In any web-application, it's axiomatic that "the user neither knows nor cares how it's done." The user, and therefore your customer, cares only that it does or does-not meet their expectations for the site. An "unspoken" expectation can kill you.

    As you proceed through first the planning-stage of your process and then its execution, make definite milestones and then hit them. Don't allow change to come into the project willy-nilly: if the executive vice-president wanders into your office one day with a new starry-eyed idea that he or she just lifted from iTunes or Google or what-have-you, don't let him or her "swish!" that into the project or your schedule.

    Many open-source projects swear by the maxim "build early and often." Ditto: "test." That's a very good principle to adopt and stick-to.

    Insist on weekly detailed meetings with your boss or client. What you view as "being well-compensated" is, to someone else, "bloody expensive!" Whatever you do, always remember that! Review your progress, discuss any and all issues, and keep them thoroughly in-the-loop. You will be judged, and you will be paid or not-paid, more by "expectations" than by "results."

Re: Large Web Application Ponderings
by valdez (Monsignor) on Nov 12, 2007 at 22:11 UTC