Re: Suggestions on differentiating child processes
by Eimi Metamorphoumai (Deacon) on Nov 12, 2004 at 15:49 UTC
|
How do the children get which data set to work on? Probably your best approach is to have the parent assign the work the child is going to do. Then you can store the child's pid (the result of the fork) and the workload assigned to it...somewhere (either to a file, or have the program output it, or something--what's appropriate depends a lot on the details of your program).
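A rough sketch of that bookkeeping, assuming the parent already knows the list of data sets. The `@datasets` values and the `%task_for` hash are made-up names for illustration, and the table goes to STDOUT only for brevity:

```perl
use strict;
use warnings;

# Hypothetical work list; a real program would get this from its own config.
my @datasets = qw(alpha beta gamma);
my %task_for;    # child pid => data set the parent assigned to it

for my $dataset (@datasets) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: work on $dataset here, then leave
        exit 0;
    }
    $task_for{$pid} = $dataset;    # parent: remember who got what
}

# Publish the pid-to-workload table somewhere an operator can find it.
print "$_ => $task_for{$_}\n" for sort { $a <=> $b } keys %task_for;

# Reap the children so no zombies are left behind.
waitpid($_, 0) for keys %task_for;
```

With the table written out, a pid seen in ps can be looked up to find the data set it was given.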
Also, you might see what setting $0 does on your system, as it might be a quick and easy way to differentiate the children in a ps listing, though it only works on some OSes.
| [reply] [d/l] |
|
|
Your first suggestion is what I was considering, as outlined in my post.
Unfortunately (for this I guess), I am using Solaris and the $0 data is obtained from the /proc filesystem by ps and cannot be changed from within the program.
| [reply] |
Re: Suggestions on differentiating child processes
by fergal (Chaplain) on Nov 12, 2004 at 17:03 UTC
|
Not a great suggestion but you could do something like
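# $TASK must be a global that each process keeps up to date
$SIG{USR1} = sub {
    # skip the report quietly if the file can't be opened
    open(my $f, '>', '/tmp/taskinfo') or return;
    print $f "$$'s current task is $TASK\n";
    close($f);
};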
then just make sure each process sets $TASK correctly and do
> kill -USR1 12345
> cat /tmp/taskinfo
12345's current task is making breakfast
This could be dangerous on perls without safe signals (pre-5.8, I think).
| [reply] [d/l] [select] |
|
|
I agree that's another good one. It's something a number
of standard *nix processes do -- dump run-time info
on receipt of a specific signal. I also worked on a system
that had logging similar to what I use now (level based
logging with the ability to set that globally or per
package). We added a signal catcher that toggled the highest
level of logging off and on globally. That way we could
trace what was going on at specific points in time as
needed.
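Something along these lines would do the toggle. This is only a sketch: the `$TRACE` flag, the `trace` sub and the choice of SIGUSR2 are assumptions for illustration, not what that system actually used.

```perl
use strict;
use warnings;

our $TRACE = 0;    # hypothetical global flag for the highest logging level

# Flip tracing on or off each time the process receives SIGUSR2.
$SIG{USR2} = sub { $TRACE = !$TRACE };

# Print a trace line only while tracing is switched on.
# Returns 1 if the message was written, 0 if it was suppressed.
sub trace {
    my @msg = @_;
    return 0 unless $TRACE;
    print STDERR "[$$] ", @msg, "\n";
    return 1;
}
```

From the shell, `kill -USR2 <pid>` turns tracing on and the same command again turns it back off.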
I've also thought that if you have some sort of
fixed settings whose values change as you run, that
a good model would be to make that accessible via
shared memory while the process is running, where the
shared data drives a dashboard sort of view of what's going
on. You could also use a database for that sort of thing
although that might add a bit of overhead. If properly
set up, shared memory would also offer the possibility of
making the settings externally accessible; e.g., you might
be able to set a trace level in the shared memory that
drives what resolution of data you get to see in the shared
data area.
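A bare-bones sketch of the shared memory idea, using Perl's built-in System V calls and the constants from IPC::SysV. The sub names and the 128-byte size are made up, and this uses a single slot that every writer overwrites; a real dashboard would probably want one slot per worker and a well-known key instead of IPC_PRIVATE.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

my $SIZE = 128;    # bytes reserved for the status string (arbitrary)

# Create a private segment before forking so children inherit its id.
my $shm_id = shmget(IPC_PRIVATE, $SIZE, S_IRUSR | S_IWUSR);
die "shmget failed: $!" unless defined $shm_id;

# Worker side: overwrite the segment with the current status.
sub publish_status {
    my ($status) = @_;
    shmwrite($shm_id, $status, 0, $SIZE) or die "shmwrite failed: $!";
    return;
}

# Dashboard side: read the segment and strip the NUL padding.
sub read_status {
    my $buf = '';
    shmread($shm_id, $buf, 0, $SIZE) or die "shmread failed: $!";
    $buf =~ s/\0.*\z//s;
    return $buf;
}

# Remove the segment when the process that created it exits.
my $owner = $$;
END { shmctl($shm_id, IPC_RMID, 0) if defined $shm_id && $$ == $owner }

publish_status("pid $$: loading data set");
print read_status(), "\n";
```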
| [reply] |
|
|
Very good idea! Thanks. This might be the simplest idea. Nice and creative, thanks.
| [reply] |
Re: Suggestions on differentiating child processes
by tmoertel (Chaplain) on Nov 12, 2004 at 17:52 UTC
|
Have each child write its status to a pid-named file in a status
directory. When its work is complete, have it delete its file.
If you use File::Temp, the cleanup is automatic.
use File::Temp;
use constant STATUS_DIR => '/tmp/status-dir';
{
    my $fh;  # hidden from outer code
    # status is logged only when you have created
    # the status dir by hand; this lets you turn off
    # logging by deleting the status dir
    if ( -d STATUS_DIR ) {
        $fh = File::Temp->new( TEMPLATE => "$$-XXXXXX",
                               DIR      => STATUS_DIR,
                               SUFFIX   => ".txt",
                               UNLINK   => 1 )
            or die "can't open temp file: $!";
        $fh->autoflush(1);
    }
    sub log_status {
        print $fh "$$ - ", scalar localtime, " - ", @_, "\n"
            if $fh;
    }
    sub close_status {
        $fh->close if $fh;
    }
}
Then your work code can log its progress:
# here is the meat of our worker code
log_status("worker starting; config=blah...");
# ... work ...
log_status("worker starting stage two");
# ... more work ...
log_status("worker starting stage three");
sleep 30; # it's a long one
# finally! whew, that was hard work :)
log_status("worker exiting");
close_status();
If a worker bee gets hung up, you can observe its status
from the log that corresponds to its pid. For example,
I just fired one up with pid 14416. Let's see what
it is doing:
$ cat /tmp/status-dir/14416-UXL0uq.txt
14416 - Fri Nov 12 12:40:06 2004 - worker starting; config=blah...
14416 - Fri Nov 12 12:40:06 2004 - worker starting stage two
14416 - Fri Nov 12 12:40:06 2004 - worker starting stage three
Ah, it's that darn stage-three work! It always takes too
long.
Also, since the log files are automatically deleted upon completion,
you can see what tasks are still running by listing the contents
of the status-log directory.
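If you'd rather not eyeball the directory, a small reporting script can do the listing. This is just a sketch that assumes the "<pid>-XXXXXX.txt" naming used above; `running_tasks` is a made-up helper name.

```perl
use strict;
use warnings;

# Return { pid => last status line } for every status file in $dir.
# Assumes files are named "<pid>-XXXXXX.txt" as in the logging code above.
sub running_tasks {
    my ($dir) = @_;
    my %last_line_for;
    opendir(my $dh, $dir) or return {};    # no status dir, nothing to report
    for my $name (readdir $dh) {
        next unless $name =~ /^(\d+)-.*\.txt\z/;
        my $pid = $1;
        open(my $in, '<', "$dir/$name") or next;    # worker may have just exited
        my $last;
        while (my $line = <$in>) {
            $last = $line;
        }
        close $in;
        chomp $last if defined $last;
        $last_line_for{$pid} = defined $last ? $last : '(no status yet)';
    }
    closedir $dh;
    return \%last_line_for;
}

my $tasks = running_tasks('/tmp/status-dir');
print "$_: $tasks->{$_}\n" for sort { $a <=> $b } keys %$tasks;
```

Each line of output is one worker that is still running, along with the last thing it logged.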
Cheers, Tom
| [reply] [d/l] [select] |
|
|
Very good idea as well, thank you. This might work quite well. I'll have to give it a go.
Thanks much.
Chad.
| [reply] |
Re: Suggestions on differentiating child processes
by steves (Curate) on Nov 12, 2004 at 16:08 UTC
|
Can you use Proc::ProcessTable
to get extended ps-like data to find what you need?
I personally like to use unbuffered logs that log enough
data to make it easy to track and watch things. That way
I also have a historical reference I can look back on to
investigate problems. The framework I'm using now (self
built) has standard logging and statistics for each process
that you mostly just get by using the packages of the
framework. We have verbose and debug levels
that can be used to control the level and type of output.
Those can be set either globally or at the package level.
There are command line options that are tied to those
settings.
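To give an idea of the global versus per-package levels, here is a stripped-down sketch. It is not the framework itself; the level names, `$GLOBAL_LEVEL`, `%PACKAGE_LEVEL` and both subs are invented for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

my %RANK = (error => 0, verbose => 1, debug => 2);    # hypothetical level names
our $GLOBAL_LEVEL = 'error';
our %PACKAGE_LEVEL;    # per-package override, e.g. ( 'My::Loader' => 'debug' )

STDERR->autoflush(1);  # keep the log unbuffered

# Decide whether a message at $level from $package should be written.
sub should_log {
    my ($package, $level) = @_;
    my $limit = exists $PACKAGE_LEVEL{$package}
              ? $PACKAGE_LEVEL{$package}
              : $GLOBAL_LEVEL;
    return $RANK{$level} <= $RANK{$limit} ? 1 : 0;
}

# Write a timestamped line tagged with the pid and the calling package.
sub log_msg {
    my ($level, @msg) = @_;
    my $package = caller;
    return 0 unless should_log($package, $level);
    print STDERR scalar(localtime), " [$$] $package $level: @msg\n";
    return 1;
}

log_msg(error => 'worker started');
```

Command line options would then just set `$GLOBAL_LEVEL` or entries in `%PACKAGE_LEVEL` at startup.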
| [reply] |
|
|
Proc::ProcessTable offers extensive searching of process-related information, but you still cannot determine what an individual process may be doing at any one time.
As for your second note, an external source, you are the second person to recommend this (third if I count myself) and it looks like I may have to go this route.
Thanks.
| [reply] |
Re: Suggestions on differentiating child processes
by iburrell (Chaplain) on Nov 12, 2004 at 18:01 UTC
|
It is possible to change the name of the process listed by ps by changing the $0 variable.
For example, at work we have a script that runs modules specified on the command line. This is so the modules can be tested. It isn't useful to see that script listed everywhere so it changes the $0 to the module it is running.
Another good example is postgres. The server processes change their name to include the database, user, client address, and command. Makes it very easy to see which command is taking up all the load and tracking it back to a client program.
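A minimal sketch of the idea. The `set_process_title` helper and the "worker [...]" label format are made up, and whether ps actually shows the new name depends on the operating system:

```perl
use strict;
use warnings;

# Hypothetical helper: label this process with the task it is working on.
sub set_process_title {
    my ($task) = @_;
    $0 = "worker [$task]";
    return $0;
}

set_process_title('dataset-42');
print "ps may now list this process as: $0\n";
```

A forked child would call this right after it learns which task it has been given.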
| [reply] |
|
|
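Changing $0 only works on systems where ps retrieves its information from the process's stack. On Solaris this information is retrieved from the /proc filesystem, which you cannot change. (Linux also uses /proc, but there the command line is read back from the process's own memory, so a changed $0 does show up in ps.)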
Thanks though.
| [reply] |
Re: Suggestions on differentiating child processes
by fglock (Vicar) on Nov 12, 2004 at 17:14 UTC
|
You can use a dummy parameter with a unique identifier, like:
perl -V:1234 -w -e ' $a = <>; print $a; '
(this example would print some garbage to STDOUT, which may be undesirable)
It shows in ps as:
perl -V:1234 -w ...
| [reply] [d/l] [select] |