geektron has asked for the wisdom of the Perl Monks concerning the following question:

i haven't been around for a while. my current client wanted everything done in another language ( PHP ), so i've been adding new toolsets to the toolbox.

anyway, a new client is allowing me to dictate the architecture of the site, and i've been strongly leaning towards mod_perl for it. i will more than likely control the production ( and dev ) environments for it ... so i'm thinking, why not?

problem is, other than 'i need mod_perl experience' and 'well, the site *is* fairly database intensive' ( a real-estate listing site ) i can't really justify the architecture.

my question: does the DB connection merit the change in architecture from 'standard' cgi/perl dev into mod_perl dev? i can't think of the need for session handling right now ( it's just a search-and-display type of app ). i also don't think any heavy logging is going to need to be done.

my mod_perl inexperience is showing through, obviously.

(jeffa) Re: thinking about mod_perl
by jeffa (Bishop) on Jul 19, 2003 at 16:21 UTC
    mod_perl gains a lot of its performance from the fact that the Perl interpreter is compiled into Apache. This means that you don't have to fork and exec the Perl interpreter for every request. As for database connections, your application will speed up by utilizing Apache::DBI. Instead of connecting and disconnecting every time a page is requested, you maintain a persistent database connection. Personally, i think you should try out mod_perl. :) There are at least three books out right now to help you:
    1. Practical mod_perl
    2. mod_perl Developer's Cookbook
    3. Writing Apache Modules with Perl and C
    Good stuff. :)
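
    A minimal sketch of the Apache::DBI setup, assuming mod_perl 1.x with Apache::DBI installed. The file path, DSN, user name and password are placeholders made up for the example:

```perl
# startup.pl: loaded once at server start with a line such as
#   PerlRequire /path/to/startup.pl
# in httpd.conf (the path is a placeholder).
use strict;

# Load Apache::DBI before DBI so that DBI->connect is routed through it.
use Apache::DBI ();
use DBI ();

# Optional: open the connection when each Apache child starts, rather than
# on that child's first request. DSN and credentials are hypothetical.
Apache::DBI->connect_on_init(
    'dbi:mysql:database=listings;host=localhost',
    'webuser',
    'secret',
    { RaiseError => 1, AutoCommit => 1 },
);

1;
```

    The scripts themselves keep calling DBI->connect with the same arguments; each Apache child then hands back its cached handle, and calls to disconnect are quietly ignored.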

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    

      In addition to embedding the Perl interpreter into the Apache process, one of the really big reasons you can get such a large speed increase is that you only have to load and compile your scripts and the modules they use once for the life of the Apache process if you use Apache::Registry (the most common method of running CGIish applications under mod_perl). You do need to make sure your programs will work ok being run this way; for the most part, if they'll run under use strict; you'll be okay.

      Even if you can't get your scripts to run that way (for instance if you have to get a large bunch of Perl4 code working under mod_perl) you can still take advantage of the embedded Perl interpreter by using Apache::PerlRun. There's more information over at the mod_perl documentation site and I'd particularly recommend reading the porting to mod_perl guide which spells out most of the differences between writing scripts for CGI execution and for execution under mod_perl.

        That "for the most part" comment is dangerously misleading.

        If you have declared script-level variables with my, then turning on warnings will give you 'Variable "$x" will not stay shared' warnings. Do not ignore this! It will seem to work under light testing. But it will lead to persistent weird errors in production, errors like someone occasionally getting data from someone else's page for no good reason.

        What goes wrong is that the first time a given Apache process loads the page it works just fine. The second time you load it, those variables declared with my that are accessed within functions in the script are from the first request, but within the main body of the script are from the second.

        If you wish to understand why, read Re (tilly) 9: Why are closures cool? and then realize that Apache::Registry works by putting a function around your entire script, eval's that, and then calls that function for each request. Which creates exactly the configuration of that node. So the my variables are accessed within your script just fine - but in functions in your script you get the first request that that Apache process saw. (And in light testing you can easily miss this.)
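
        The effect can be reproduced outside Apache. This is only a sketch: a plain sub named handler stands in for the wrapper that Apache::Registry builds, and handler, show and @seen are names made up for the illustration.

```perl
use strict;
use warnings;

my @seen;    # collects what each simulated request displayed

# Stand-in for the function Apache::Registry wraps around a script:
# the former script body is now the body of a named sub.
sub handler {
    my $name = shift;    # was a script-level "my" variable

    # A named sub inside another named sub is compiled only once and
    # binds to the first instance of $name. With warnings on, perl
    # reports: Variable "$name" will not stay shared
    sub show {
        return $name;
    }

    push @seen, show();
}

handler('alice');    # first request served by this process
handler('bob');      # second request: show() still sees 'alice'

print "@seen\n";     # prints "alice alice"
```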

        If you don't wish to understand why, just remember to turn warnings on, and don't ignore the message that they give. In particular the shared warnings can generally be fixed by declaring the variable in question to be global.
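
        A sketch of that fix, using a plain sub to stand in for the wrapper Apache::Registry builds around a script (handler, show and @seen are made-up names, and our is just one way to declare the global):

```perl
use strict;
use warnings;

my @seen;    # collects what each simulated request displayed

sub handler {
    # Package global instead of "my": there is one $main::name, and the
    # nested sub below looks it up by name on every call.
    our $name = shift;

    sub show {
        return $name;    # no "will not stay shared" warning here
    }

    push @seen, show();
}

handler('alice');
handler('bob');

print "@seen\n";    # prints "alice bob"
```

        Keep in mind that a global keeps its value from one request to the next inside the same Apache process, so it has to be assigned on every request, as it is here.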

        Incidentally the porting link you provide covers this point first because so many people get it wrong and then don't understand the weird bugs that they get. Which is why it is important that anyone telling others how to do it should do likewise, so that we don't get more confused people who only look for the code example, port their code, see that it is faster, and then wonder where the weird bugs that people are complaining about came from. (Note that common mis-advice for how to apply strict.pm is, "Just declare everything with my" - which compounds the problem.)

      i've been reading through the online mod_perl guide. i read the third book mentioned a while ago but it's about to come off the shelf.

      i'm trying to decide which is going to be better -- writing the app and running it under Apache::Registry, or writing an app that starts with mod_perl ( Apache::DBI, etc ) and stays there. i don't see the client moving the site to another server ... so i think it's safe to write a 100% mod_perl app.

        mod_perl is pretty common in open source OSes. As an example, Red Hat has provided apache w/ mod_perl since rh 6.1 or 6.2.

        Apache::Registry does give you that 'out' though. I actually did some code that way - so it would run both with and without mod_perl. In fact I did it both ways on the same box, in the same directory, using a Location directive to use Apache::Registry if the URL was, eg, /perl/... and letting it be regular CGI if the URL was, eg, /cgi-bin/... It's not hard.

        --Bob Niederman, http://bob-n.com
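
        For reference, a dual setup like that might look roughly as follows in httpd.conf under Apache 1.3 and mod_perl 1.x. This is a sketch, and the directory path is a placeholder:

```apache
# Plain CGI: each request to /cgi-bin/... starts a new perl process.
ScriptAlias /cgi-bin/ /var/www/scripts/

# The same directory served through mod_perl: scripts requested as
# /perl/... are compiled once per Apache child by Apache::Registry.
Alias /perl/ /var/www/scripts/
<Location /perl>
    SetHandler     perl-script
    PerlHandler    Apache::Registry
    Options        +ExecCGI
    PerlSendHeader On
</Location>
```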
Re: thinking about mod_perl
by cees (Curate) on Jul 19, 2003 at 17:24 UTC

    If you are building your application using perl, then there is no reason not to consider mod_perl. It is simple to write your applications so that they work both as plain CGI scripts and work correctly under mod_perl. All you need to do is follow the safe coding guidelines that the mod_perl docs suggest when building your application. Most of these suggestions are good coding practice anyway and will result in better code.

    A good start is to always use strict, and to use a good framework for building your application. I personally use CGI::Application which does a great job of organizing your code, and is fully compatible with mod_perl.

    Read the docs at http://perl.apache.org/, and check out the guide "CGI to mod_perl Porting. mod_perl Coding guidelines."

    Of course mod_perl offers a huge amount more than just speeding up your CGI scripts, but I would suggest starting simple...

    - Cees

       use strict; flies onto the screen right after the proper #! declaration for me.

      I'm partial to Template::Toolkit for systems ... that way everything becomes *roughly* MVC, and is easier for me to debug: display problems are in one place, logic problems in another.

        My reason for using CGI::Application also stems from trying to follow the MVC pattern. CGI::Application acts as the Controller and suggests using HTML::Template for the Views (although many people use TT in place of H::T). For the Model I usually create Class::DBI modules of my tables and litter them with many helper functions.

        The other reason why CGI::Application is so good to use with mod_perl is that the actual CGI script is very small. All of the code is put into modules. The reason this is important when using mod_perl is explained by tilly's comment above. The actual CGI script is only a couple of lines long and hence you will not have problems with the "will not stay shared" issues that many legacy CGI scripts suffer from.
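
        As a rough sketch of that layout (ListingSearch, its run modes and the city field are all invented for the example; only the CGI::Application methods are real):

```perl
# ListingSearch.pm: hypothetical module holding all application code.
# The CGI script itself then shrinks to two lines:
#     use ListingSearch;
#     ListingSearch->new->run;
package ListingSearch;
use strict;
use base 'CGI::Application';

# Called once by new(): declare the run modes and the default one.
sub setup {
    my $self = shift;
    $self->start_mode('form');
    $self->run_modes(
        form    => 'show_form',
        results => 'show_results',
    );
}

# Run modes return their output instead of printing it.
sub show_form {
    my $self = shift;
    return '<form method="post">'
         . '<input type="hidden" name="rm" value="results">'
         . 'City: <input name="city"> <input type="submit">'
         . '</form>';
}

sub show_results {
    my $self = shift;
    my $q    = $self->query;            # the CGI.pm query object
    my $city = $q->param('city');
    $city = '' unless defined $city;
    # A real application would query the database here.
    return '<p>Listings for ' . $q->escapeHTML($city) . '</p>';
}

1;
```

        Because every sub lives in the module rather than in the script that Apache::Registry wraps, none of them ends up nested inside the generated handler.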

        I'm not trying to turn this into a "My Way is better than Your Way". But something you should get out of this is that it is very beneficial to place the majority of your code into modules instead of building a monolithic script littered with functions, especially when you are using mod_perl.

        - Cees

Re: thinking about mod_perl
by perrin (Chancellor) on Jul 19, 2003 at 21:36 UTC
    It comes down to this: if you use vanilla CGI, your application will be slower than the same thing built in PHP. If you build it in mod_perl or FastCGI or SpeedyCGI, it will be faster than PHP. There are various benchmarks out there showing this, including one from Yahoo.
Re: thinking about mod_perl
by jonadab (Parson) on Jul 19, 2003 at 18:10 UTC

    If you need some hype^H^H^H^Hreasons why mod_perl is better than vanilla Apache and perl, check out the PHP documentation, and s/PHP/mod_perl/; technically this is all bunk, since mod_perl (and PHP) still uses the common gateway interface just like a vanilla Apache/perl installation does, but it manages to convince a lot of people that PHP is better than Perl, so it ought to work just as well at explaining the superiority of mod_perl.


    Quidquid latine dictum sit altum viditur.
      A technical nit, mod_perl does not use the common gateway interface.

      If there is a resemblance, it is because the common gateway interface was designed based on the structure of an http request, and the commonalities between what a mod_perl job has to do and a CGI script have been intentionally exploited by the mod_perl people to make mod_perl easier to get into.

      Also note that mod_perl gives access to phases of Apache's handling of an http request that PHP does not and CGI cannot. (This is both good and bad - ISPs are more inclined to give people PHP rights on servers than mod_perl because the latter is harder to restrict.)

        mod_perl does not use the common gateway interface

        I realise the backend is implemented differently on the server side, but the interface between the browser and the server is the same. It has to be; if it weren't, it wouldn't work without special browser support. The same is true of PHP. This is what makes the gateway interface common, is it not?


        Quidquid latine dictum sit altum viditur.
      "still uses the common gateway interface" is true in itself, but definitely not the complete truth.

      With mod_perl you can have complete control over the handling of a request, allowing you to forgo the setting up of environment variables that are part of the CGI specification. Instead of environment variables, you have methods that can be called on a number of objects, representing the request, the connection, the configuration of the Apache server, etc.
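
      For instance, a bare-bones content handler under the mod_perl 1.x API might look like this. It is a sketch only: the package name My::Hello and the /hello location are made up.

```perl
# My/Hello.pm: hypothetical handler module, enabled in httpd.conf with
#   <Location /hello>
#       SetHandler  perl-script
#       PerlHandler My::Hello
#   </Location>
package My::Hello;
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;    # the Apache request object

    # Method calls take the place of CGI environment variables:
    my $method = $r->method;                          # $ENV{REQUEST_METHOD}
    my $ip     = $r->connection->remote_ip;           # $ENV{REMOTE_ADDR}
    my $agent  = $r->header_in('User-Agent') || '';   # $ENV{HTTP_USER_AGENT}

    $r->content_type('text/plain');
    $r->send_http_header;
    $r->print("$method request from $ip using $agent\n");

    return OK;
}

1;
```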

      And since there is no external program being called, there is no concept of piping information to/from that program (as specified in the Common Gateway Interface), although stdout is usually tied to what will be sent to the browser.

      The handling of the request can go pretty far: in mod_perl 2 it is even possible to have a piece of Perl code executed before the connection is completed. This allows you to bar bad visitors from using any of your servers' resources, e.g. depending on IP address.

      Liz