I can't figure out what the issue is here.
I've got a _huge_ mod_perl application running on Apache with the worker MPM, using a single process and approx. 40 threads.
The threads share memory between themselves (threads::shared; some 300-500 MB is not unusual).
Locks are used to guard reads and writes, but those account for only about 5% of the execution time.
1) Setting Apache to 1 process and 1 thread results in what's marked
"seriell" in the image below. One call to my test page takes 0.8s. 10 calls take a
total of 8.5s before the last one completes.
2) The multithreaded version with 1 process and 40 threads results in chaos
(marked "parallell" in the image).
A single call to the test page takes 0.8s. With 10 concurrent calls, the first reply
arrives after 15s and the last 5 arrive at around 25s.
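To put the two runs side by side, here is a quick back-of-the-envelope calculation using only the figures above. The variable names are just for illustration, and the 25s figure is approximate.

```python
# Figures taken from the two measurements described above.
n_calls = 10
serial_total_s = 8.5      # 10 calls, 1 process / 1 thread
parallel_total_s = 25.0   # 10 concurrent calls, 1 process / 40 threads (approximate)

# Average wall-clock cost per call in each configuration.
serial_per_call = serial_total_s / n_calls
parallel_per_call = parallel_total_s / n_calls

# How much longer the concurrent run takes to finish the same 10 calls.
slowdown = parallel_total_s / serial_total_s

print(f"serial:   {serial_per_call:.2f} s per call")
print(f"parallel: {parallel_per_call:.2f} s per call")
print(f"parallel run takes {slowdown:.1f}x as long as the serial run")
```

So the threaded setup is not just failing to speed things up: it takes roughly three times as long as handling the same calls one after another.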
Any clue what can cause the high %system load when running concurrently?
Is there a good way to decode the %system load?
It's neither I/O, swapping nor disk writes AFAIK, since iostat came out clean.
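One way to start breaking down the %system figure might be to check which threads accumulate the system time. The following is a minimal sketch that assumes a Linux host with /proc mounted. The function names `parse_stat_line` and `thread_cpu_seconds` are made up for this example, and it only shows where the time goes, not why.

```python
import os


def parse_stat_line(line):
    """Return (utime, stime) in clock ticks from a /proc/<pid>/task/<tid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces,
    # so cut after the last ')' instead of splitting the whole line.
    rest = line[line.rfind(")") + 2:].split()
    # rest[0] is field 3 (state); utime is field 14 and stime is field 15.
    return int(rest[11]), int(rest[12])


def thread_cpu_seconds(pid):
    """Map thread id -> (user seconds, system seconds) for one process (Linux only)."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    task_dir = f"/proc/{pid}/task"
    result = {}
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as handle:
                utime, stime = parse_stat_line(handle.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
        result[int(tid)] = (utime / ticks_per_second, stime / ticks_per_second)
    return result
```

Calling `thread_cpu_seconds(pid)` with the Apache child's process id before and after a test run, then subtracting the two results, should show whether the system time is spread evenly over all threads or concentrated in a few. A per-syscall summary from a tool such as `strace -c` would then be a reasonable next step.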
CPU graph over time:
https://frost.denacode.se/pub/tmp/serial_vs_parallel.png
tx!