If you are using threads, do as little as possible that consumes memory in your main thread, and that includes initialising data, before you spawn your threads. Here are timing and memory usage stats from two consecutive runs of a simple threaded script. The only difference between them is the relative position of two lines of code:
    c:\test>junk
    Image Name                   PID Session Name     Session#    Mem Usage
    ========================= ====== ================ ======== ============
    tperl.exe                  10172                         0     64,840 K
    Taken 3.278383 seconds

    c:\test>junk
    Image Name                   PID Session Name     Session#    Mem Usage
    ========================= ====== ================ ======== ============
    tperl.exe                   2924                         0    173,516 K
    Taken 8.761321 seconds
For the first run, the code looked like this:
    #! perl -slw
    use strict;
    use threads;
    use Time::HiRes qw[ time ];

    sub simplesub { sleep 10, return 1 }

    my $start = time;

    # Threads are spawned first, before the big data structures exist.
    my @threads = map{ threads->create( \&simplesub ) } 1 .. 10;

    my @array = 0 .. 1e5;
    my %hash  = 1 .. 1e5;

    system qq[tasklist /fi "pid eq $$"];
    printf "Taken %f seconds", time() - $start;

    $_->join for @threads;
For the second run, like this:
    #! perl -slw
    use strict;
    use threads;
    use Time::HiRes qw[ time ];

    sub simplesub { sleep 10, return 1 }

    my $start = time;

    # The big data structures are built first, so every thread clones them.
    my @array = 0 .. 1e5;
    my %hash  = 1 .. 1e5;

    my @threads = map{ threads->create( \&simplesub ) } 1 .. 10;

    system qq[tasklist /fi "pid eq $$"];
    printf "Taken %f seconds", time() - $start;

    $_->join for @threads;
So, another secret to (somewhat) lighter threads is to ensure that you spawn your threads early in the program, before you generate lots of data structures in your main thread. Everything that exists in your main thread's memory at the time of the spawn (including everything created by all the packages you have used, whether the use appears physically before or after the point of spawn!) will be cloned wholesale into the memory of each thread you spawn!
That has a downside: you don't always want to spawn your threads right at the start of your code, as you often don't have everything they need at that point. That, in turn, requires that you arrange for your threads to wait for the information they require, and that you have some method of passing that information to them at some later point once it is available. And that introduces the complications of queues, shared memory and synchronisation.
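For illustration, here is a minimal sketch of that arrangement using the Thread::Queue module. The worker sub, the addition it performs and the job layout (an array ref of two numbers) are made up for the example, and passing array refs through the queue assumes Thread::Queue 2.00 or later. It is one workable pattern, not the clean interface asked for below.

```perl
#! perl -w
use strict;
use threads;
use Thread::Queue;

# Hypothetical worker: blocks on its queue until the main thread sends
# a job (an array ref of arguments). An undef job tells it to finish.
sub worker {
    my( $Qin ) = @_;
    my @results;
    while( defined( my $job = $Qin->dequeue ) ) {
        my( $x, $y ) = @$job;
        push @results, $x + $y;    # stand-in for the real work
    }
    return @results;
}

# Spawn early, while the main thread holds almost nothing to clone.
# The list-context assignment lets join() return a list later.
my $Q = Thread::Queue->new;
my( $thread ) = threads->create( \&worker, $Q );

# Only now build the big data structures in the main thread.
my @array = 0 .. 1e5;
my %hash  = 1 .. 1e5;

# The parameters become available later; hand them over via the queue.
$Q->enqueue( [ 1, 2 ] );
$Q->enqueue( [ 30, 12 ] );
$Q->enqueue( undef );              # no more work

my @results = $thread->join;
print "@results\n";                # prints "3 42"
```

Because the worker loops on dequeue, the same thread can be handed several sets of parameters in turn. The cost is exactly the queue and termination protocol described above.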
What I've been looking for for a while now is a simple interface to a mechanism that allows me to spawn my threads early, with new, clean, uncloned, interpreters, in a suspended state and then 'resume' them, passing any parameters they require using a simple, clean interface.
    my( $Xthread ) = threads->create( { suspended => 1 }, \&Xthread );
    my( $Ythread ) = threads->create( { suspended => 1 }, \&Ythread );

    # ... Do other stuff that gets me the parameters for X
    $Xthread->resume( $arg1, $arg2 );

    # ... Generate/fetch/calculate args for Y
    $Ythread->resume( $Yarg1, $Yarg2 );

    # ... tum te tum
    my( @Yresults ) = $Ythread->join;

    # ...
    my( @Xresults ) = $Xthread->join;
Does anyone have suggestions for how to go about doing this?
If the threads could be 're-resumed' with different parameters that would be even better.
In reply to threads: spawn early to avoid the crush. by BrowserUk