in reply to Re^5: Timing concerns on PerlEX/Mod_Perl
in thread Timing concerns on PerlEX/Mod_Perl

Well, if I run the script with 2-3 people using it, it runs hassle-free. The issue happens when I make it live (but CPU usage doesn't go up).

I can try, though. But since the code is written in a centralized environment, it is more dynamic. For example, I write in the HTML:

<*Get List|type:all*>

So the only things I can really comment out are the objects; the rest is kinda interdependent. For a better understanding, here is my code structure (with a rough sketch after the list):

I have 4 namespaces:

main - the main namespace, which calls the other namespaces and handles outputting the data from common (headers) and frame (HTML).

common - stores all the basic stuff like opening files, accessing the database, etc. It also loads query strings and headers.

frame - loads the HTML and the more advanced functions like login/account, etc., which are built up from calls into common. After loading the HTML, it runs the Skin namespace over the loaded HTML.

Skin - loads the layout, all the images, textboxes and forms, assigns IDs, and handles other more advanced objects.
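
Very roughly, it hangs together something like this (a stripped-down sketch only; the package contents and sub names here are simplified stand-ins, not the real code):

    use strict;
    use warnings;

    package common;    # basic stuff: files, database, query strings, headers
    sub load_headers { return "Content-Type: text/html\n\n" }
    sub get_list     { my %args = @_; return "list of type $args{type}" }

    package Skin;      # layout, images, textboxes, forms, ids
    sub apply { my ($html) = @_; return "<div id='layout'>$html</div>" }

    package frame;     # loads the HTML and expands the <*...*> tags via common
    sub render {
        my ($template) = @_;
        $template =~ s/<\*Get List\|type:(\w+)\*>/common::get_list(type => $1)/ge;
        return Skin::apply($template);
    }

    package main;      # calls the others and prints the result
    print common::load_headers();
    print frame::render("<html><body><*Get List|type:all*></body></html>");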



Sigh.. my urge to have a centralized structure is probably my biggest downfall. It's good and convenient, but if the core has issues, it makes everything much harder to debug :(

Re^7: Timing concerns on PerlEX/Mod_Perl
by MidLifeXis (Monsignor) on Jul 26, 2008 at 14:41 UTC

    Well, if I run the script with 2-3 people using it, it runs hassle-free. The issue happens when I make it live (but CPU usage doesn't go up).

    Have you tried loading your site up from 3 clients to whatever your production load is? If your server is only set to handle a certain number of requests at a time, your other requests may be waiting behind those that are currently processing. These waiting requests would not necessarily cause the load of the machine to increase.

    Try bringing up the number of clients slowly and see when the problem starts. Then tune your web server / web farm to handle the required number of connections.
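
    For example (assuming you have ApacheBench or a similar load-testing tool available; the URL below is just a placeholder), you can step the concurrency up and watch where the response times jump:

        ab -n 200 -c 5  http://yourserver/cgi-bin/yourscript.pl
        ab -n 200 -c 25 http://yourserver/cgi-bin/yourscript.pl
        ab -n 200 -c 50 http://yourserver/cgi-bin/yourscript.pl

    If the reported "Time per request" climbs sharply at some concurrency level while the CPU stays mostly idle, requests are queueing behind your server's worker/connection limit rather than being slowed by your code.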

    --MidLifeXis

Re^7: Timing concerns on PerlEX/Mod_Perl
by BrowserUk (Patriarch) on Jul 26, 2008 at 19:06 UTC
    my urge to have a centralized structure is probably my biggest downfall

    It seems to me that you know what your problem is, but just don't want to admit it (to yourself) :)

    Viz.

    I have 4 namespaces: main - the main namespace, which calls the other namespaces and handles outputting the data from common (headers) and frame (HTML). common - stores all the basic stuff like opening files, accessing the database, etc. It also loads query strings and headers. frame - loads the HTML and the more advanced functions like login/account, etc., which are built up from calls into common. After loading the HTML, it runs the Skin namespace over the loaded HTML. Skin - loads the layout, all the images, textboxes and forms, assigns IDs, and handles other more advanced objects.

    You're replicating everything, for every request, and hoping that mod_perl/PerlEx will take care of things. I think you are expecting far too much of them.

    When I said earlier that I know nowt about them, I meant that I know nothing about tuning them. I do know what they do, but have never had the desire to use them. They seem altogether too hackish to me.

    I've no use for Apache either. For example, you said in your OP that your memory consumption is approaching 1.7GB. In that same space I could run 3500 concurrent copies of TinyWeb, each capable of servicing dozens of concurrent requests.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Well, a centralized system is good.. when it works. For example, Perl: if the machine code that makes Perl possible works, it's great, but when there is a fault with the machine code itself, it makes things much more difficult. Due to the way the site is, a central system just seems to make more sense.

      Plus, 3500 is kind of cheating, for two reasons. First, of the 1.7GB, I would say 900MB or so is the real number; the rest is things like Windows services, etc. Second, even if you do run 3,500 instances, they would each also spawn their own perl.exe, which in turn would consume more resources, no?

      mod_perl/PerlEx isn't really that bad, because it saves on starting up. Perl's DLL itself is around 900KB, so loading it 1000 times would mean 900,000KB. Even if I were not to use its pre-loading abilities, I save a lot on startup. The downside, though, is the lack of documentation and of ways to debug. Every update to the code forces me to reset the entire server and start it up again, which makes debugging in live environments a living hell.

      For now I am probably going to have to just buy more servers and load-balance until I can find a solution to the issue. I was hoping, though, that someone here was familiar with the way mod_perl/PerlEx works and had experienced similar issues. The execution of the code is only 0.2 seconds, but it takes 20 seconds to start up, which doesn't make sense, other than threads having issues accessing the same namespace. Also, PerlEx doesn't use main as its primary namespace but instead uses PerlEX::Instance ID blah blah blah, so I am thinking maybe forcing it to use main is causing slowdowns. But these are all *hunches* and I can't say for sure, so while I try things, I am hoping someone will recognise what the issue could be.

        First off, there would be no point in running 3500 instances. It was just by way of example to show what a resource hog Apache is.

        even if you do run 3,500 instances, they would each also spawn their own perl.exe, which in turn would consume more resources, no?

        Yes & no. Yes, each would run its own copy of Perl. No, that wouldn't consume vast amounts of resource. Under Win32 (and probably under *nix, but that's not my domain), when you run a second copy of an executable, the executable and static data segments of the process are shared, i.e. only one copy is loaded into memory. Only the stack and heap segments are unique. So starting a second copy of either tiny.exe or perl.exe costs very little: just their stack and heap allocations, and those can be set very small and allowed to grow on demand.

        In theory, when Apache/mod_perl forks, the preloaded chunks of Perl code are shared by COW--BUT IT AIN'T TRUE! Every time a forked copy executes code from the preloaded cache and does any one of a number of simple things--like taking a reference (to anything!), or incrementing, or decrementing, or in some cases even just printing the value of a scalar--whole chunks of the COW-"shared" memory have to be allocated and copied. So the mod_perl hack to avoid loading time just trades that for piecemeal, on-the-fly memory allocation and copying. And the more you preload, the worse it gets. Hence your problems, I think.
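
        You can see the mechanism for yourself (a rough demonstration only; the point is the REFCNT field changing, which means the memory holding the scalar has been written to, so a COW-shared page containing it would have to be copied):

            use Devel::Peek;

            my $x = 42;
            Dump( $x );       # note REFCNT = 1

            my $ref = \$x;    # "just" taking a reference...
            Dump( $x );       # ...and REFCNT is now 2: the SV was written to,
                              # so any COW-shared page holding it gets un-shared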

        Conversely, Perl CGI scripts are individually quite small (compared to their loaded footprint), and modern servers do a pretty amazing job of keeping frequently used files in cache. The same memory you are utilising to cache your mod_perl-loaded code, just in case it is needed, is far better devoted to letting the system cache the scripts that actually are used!

        Most web sites--not all, I know, but most--have (maybe) two or three dozen oft-used CGIs. Now imagine that you had one instance of tiny (or lighttpd or nginx) set up to service each of those CGIs, and a reverse proxy to distribute the requests to them (plus a static page server or two, and an image server or two). Each one can handle hundreds if not thousands of concurrent requests. You get fault-tolerance, load distribution, etc. Then go one step further and have each CGI server run the single CGI it serves over a FastCGI connection to a matching Perl instance.
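
        As a rough sketch of that last part (using the FCGI module; the content is invented for the example), each such Perl instance is just a small, persistent worker loop that the front-end server talks to over FastCGI instead of re-launching perl.exe per request:

            use strict;
            use warnings;
            use FCGI;

            # One persistent worker, serving one CGI's worth of functionality.
            # The front end (tiny/lighttpd/nginx) forwards matching requests here.
            my $request = FCGI::Request();

            while ( $request->Accept() >= 0 ) {    # blocks until the next request
                my $type = $ENV{QUERY_STRING} || 'all';
                print "Content-Type: text/html\r\n\r\n";
                print "<html><body>Get List, type: $type</body></html>";
            }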

        Apache, and 'centralisation' in general, serve only to complicate things. With all your eggs in the same basket, finding the bad egg (bugs) is a total PITA--as you are discovering. By keeping individual things separated, you have the opportunity to concentrate your efforts on tuning those scripts that need it. The ones that get hit hardest. If need be, you can substitute a second layer of load balancing for any node and distribute load where needed. And if one script dies catastrophically, only that script is affected. The rest of the site continues oblivious to the problem.

        Monitoring for failures and generating notifications is trivial. And the post-mortem process is far easier, because only the logging from that particular CGI is in that server's logs.

        Need to add a second (or more) physical server to the mix? 'Tis easy: just split the individual instances across the machines according to their time/resource usage.

        People seem to have forgotten the *nix philosophy of having each process do one thing and do it well. Programs like Apache, which contain everything including the kitchen sink (with 2 1/2 bowls, a spray head and hands-free tap, and a waste digester!), load everything anyone might ever need. But there are probably only a handful of sites that ever use more than half of it.

