What do the page-fault, I/O, and resident-memory figures for your web server look like when this happens? Are you sure the web server isn't getting the response from your program quickly but then having to page back in from a thrashing disk?
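If you're not sure where to pull those numbers from, something like the following can poll them while the slowdown is happening. This is only a rough sketch, assuming a Windows host with Win32::OLE available; the worker-process name ('w3wp' below) is a placeholder you'd swap for whatever your IIS version actually runs.

    use strict;
    use warnings;
    use Win32::OLE qw(in);

    # Sketch: poll page faults and working set for the IIS worker
    # process via WMI. Assumes Windows + Win32::OLE; process name
    # is hypothetical -- adjust to your worker process.
    my $wmi = Win32::OLE->GetObject('winmgmts:\\\\.\\root\\cimv2')
        or die "WMI connection failed: ", Win32::OLE->LastError;

    while (1) {
        my $procs = $wmi->ExecQuery(
            q{SELECT Name, PageFaultsPersec, WorkingSet
              FROM Win32_PerfFormattedData_PerfProc_Process
              WHERE Name = 'w3wp'}
        );
        for my $p (in $procs) {
            printf "%s: %s faults/sec, working set %d KB\n",
                $p->{Name}, $p->{PageFaultsPersec}, $p->{WorkingSet} / 1024;
        }
        sleep 5;
    }

A steadily climbing fault rate during the slow periods would point at paging rather than your script.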
Actually, if I run 2 instances of the site, one can be slow while the other one is fast (the fast one is an instance no one accesses, but it uses the same source). So if a thrashing disk were the cause, wouldn't both be slow?
And yes, I tried running multiple instances and load balancing, but that didn't help either.
Try load testing your site. It sounds like you are saying that in development/test the server behaves just fine. If that is the case, use a load-testing suite to find how many simultaneous clients it handles well. Use your server logs to determine the maximum number of clients you see in production, and ramp from your testing values (2-3) up to that number.
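If you don't have a load-testing tool handy, even a crude script can expose where the knee in the curve is. A minimal sketch, assuming LWP::UserAgent is installed; the URL and counts are made up, and since fork is emulated on Windows Perl, it's best run from a separate machine:

    use strict;
    use warnings;
    use LWP::UserAgent;
    use Time::HiRes qw(time);

    # Sketch: fork N clients that each fetch the page repeatedly
    # and report their timings. URL and counts are placeholders.
    my $url       = 'http://yourserver/yourpage.pl';
    my $clients   = shift || 3;   # ramp this toward your production max
    my $reqs_each = 10;

    for my $c (1 .. $clients) {
        next if fork;             # parent keeps spawning; child falls through
        my $ua = LWP::UserAgent->new(keep_alive => 1, timeout => 60);
        for my $r (1 .. $reqs_each) {
            my $t0  = time;
            my $res = $ua->get($url);
            printf "client %d req %d: %.3fs (%s)\n",
                $c, $r, time - $t0, $res->status_line;
        }
        exit;
    }
    wait() for 1 .. $clients;     # reap all children

Run it with increasing client counts and watch for the point where per-request times jump.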
If your server is set to only accept a certain number of connections, it will not necessarily show a higher load when clients are "waiting in the wings".
Also, if you have multiple instances, each one has its own connection pool, so you may not see the slowdown on both.
For example, assume that your server is set to accept 10 simultaneous connections. If you have 2-3 clients, you will not see any slowdown. If, however, you have 15 clients making requests, each one uses keep-alive to fetch the page plus its associated graphics (10), js (2), and css (1) files, and each request takes a round trip of 0.5 seconds, then each client holds a connection for 14 requests × 0.5 seconds = 7 seconds. So the first 10 clients hold connections for 7 seconds, then the next five get to try. That second set of clients sees the first group's 7-second delay plus their own 7 seconds of requests, for a total of 14 seconds.
Now imagine that you have 2 requests coming in per second. That is 10 requests (the server max) every 5 seconds, but you are only able to handle 10 every 7 seconds. Your wait queue will start to stack up, and your clients will see longer and longer delays.
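You can watch that arithmetic play out with a toy simulation; the numbers below are the ones from the example above, not measurements from your server:

    use strict;
    use warnings;

    # Toy queue model: clients arrive at 2/sec, the server accepts
    # 10 simultaneous connections, each is held for 7 seconds.
    my $max_conns   = 10;
    my $hold_secs   = 7;
    my $arrivals_ps = 2;

    my @busy_until;  # finish times of in-flight connections
    my @queue;       # arrival times of clients still waiting
    for my $t (0 .. 30) {
        push @queue, ($t) x $arrivals_ps;            # this second's arrivals
        @busy_until = grep { $_ > $t } @busy_until;  # release finished connections
        while (@queue && @busy_until < $max_conns) {
            my $arrived = shift @queue;
            push @busy_until, $t + $hold_secs;
            printf "t=%2d: client from t=%2d starts after waiting %ds\n",
                $t, $arrived, $t - $arrived;
        }
    }
    printf "still queued after 30 seconds: %d clients\n", scalar @queue;

The waits printed near the end are noticeably longer than those at the start, and the leftover queue keeps growing, which is exactly the pattern of intermittent slowness you are describing.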
When your server sees fewer requests per second, it gets a chance to catch up, which is why you sometimes see good performance and other times you do not.
As far as I've seen there isn't really any corruption on the disk. Or do you mean the disk's read/write capacity is maxed out, and that's causing the slowdowns?
Here is a sample:

    Page Beginning to End in perl: 0.506701946258545 seconds
    Page in IIS Time Taken:        26.063 seconds
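For reference, here is roughly how such an in-script number is usually captured; this is a generic Time::HiRes sketch, not your actual code:

    use strict;
    use warnings;
    use Time::HiRes qw(time);

    my $t0 = time;            # high-resolution timestamp at the top of the script
    # ... build and print the page here ...
    my $elapsed = time - $t0;
    print STDERR "Page Beginning to End in perl: $elapsed seconds\n";

If the script really finishes in half a second, the gap between that and the IIS "Time Taken" field would be time spent outside your code: waiting for a connection, inside the server, or on the wire.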