I'll probably have to add a process table that stores the child pids
and have to use waitpid - not immediately, but after some time has
passed - which is a problem, because the running time of the children
varies from a few microseconds to several minutes. If I use waitpid too
early, my parent process will get blocked.
Check out the documentation for waitpid. You can tell waitpid to NOT block (i.e., return immediately if there are no children to reap) using the WNOHANG option. Moreover, you don't need to give waitpid a PID. You can specify -1, and it will wait for any child. So calling waitpid as below fits your no-blocking constraint and no-saving-pids constraint:
waitpid( -1, &status, WNOHANG );
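To make that concrete, here is a minimal sketch of a reaping helper you could call periodically from the parent. The names reap_finished_children and spawn_short_lived_child are made up for illustration and are not part of any API; the only real calls are fork, waitpid and _exit.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reap every child that has already exited, without
 * blocking. Returns the number of children reaped by this call. */
int reap_finished_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    /* With WNOHANG, waitpid() returns >0 for a reaped child, 0 if children
     * exist but none has exited yet, and -1 if there are no children. */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        reaped++;
    }

    /* Sanity check: the loop ends because nothing is ready (0) or because
     * there are no children left (-1 with errno == ECHILD). */
    assert(pid == 0 || (pid == -1 && errno == ECHILD));
    return reaped;
}

/* Hypothetical helper for the demo: fork a child that exits at once.
 * Returns the child's PID in the parent, or -1 if fork() failed. */
pid_t spawn_short_lived_child(int code)
{
    pid_t pid = fork();
    if (pid == 0) {
        _exit(code);
    }
    return pid;
}
```

Calling reap_finished_children() every so often, for example once per iteration of the parent's main loop, keeps zombies from piling up without the parent ever blocking or tracking PIDs.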
If you really don't want to properly handle process creation, then you can give the reaping responsibility to init by forking twice, reaping the child, and giving the exec to the grandchild:
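// Needs <errno.h>, <error.h> (GNU extension), <stdlib.h>, <sys/wait.h>, <unistd.h>
pid_t temp_pid, child_pid;
temp_pid = fork();
if( temp_pid == 0 ){
    // Intermediate child: fork the grandchild, then exit right away.
    child_pid = fork();
    if( child_pid == 0 ){
        // exec()
        // Reached only if the exec call fails.
        error( EXIT_FAILURE, errno, "failed to exec :(" );
    } else if( child_pid < 0 ){
        error( EXIT_FAILURE, errno, "failed to fork :(" );
    }
    exit( EXIT_SUCCESS );
} else if( temp_pid < 0 ){
    error( EXIT_FAILURE, errno, "failed to fork :(" );
} else {
    // Parent: reap the intermediate child, which exits almost immediately.
    // wait() takes a status pointer, not a PID, so use waitpid() here.
    waitpid( temp_pid, NULL, 0 );
}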
In the above code snippet, the child process forks its own child, immediately exits, and is then immediately reaped by the parent. The grandchild is orphaned, adopted by init, and will be reaped automatically.
Why does Linux keep zombies at all? Why do I have to wait for my
children? Is this to enforce discipline on parent processes? In
decades of using Linux I have never got anything useful out of zombie
processes, I don't quite get the usefulness of zombies as a "feature".
If the answer is that parent processes need to have a way to find out
what happened to their children, then for god's sake there is no
reason to count zombies as normal processes and forbid the creation of
non-zombie processes just because there are too many zombies.
How else do you propose one may efficiently retrieve the exit code of a process? The problem is that the mapping of PID <=> exit code (et al.) must be one to one. If the kernel released the PID of a process as soon as it exits, reaped or not, and then a new process inherits that same PID and exits, how would you handle storing two codes for one PID? How would an interested process retrieve the exit code for the first process? Don't assume that no one cares about exit codes simply because you don't. What you consider to be a nuisance/bug is widely considered useful and clean.
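As a rough illustration of what the zombie state buys you, the sketch below forks a child that exits with a given code and then collects that code with waitpid. The helper name run_and_collect_exit_code is hypothetical. Between the child's exit and the waitpid call, the kernel keeps the child's PID and status reserved, and that reserved entry is the zombie.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a child that exits with `code`, then retrieve
 * that code. Returns the exit code, or -1 on failure or abnormal exit. */
int run_and_collect_exit_code(int code)
{
    /* Only the low 8 bits of an exit status are reported to the parent. */
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(code);            /* child: terminate with the given code */
    }

    /* Parent: the PID cannot be reused until this call reaps the child,
     * so the status read here is guaranteed to belong to our child. */
    int status;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    if (!WIFEXITED(status)) {
        return -1;              /* killed by a signal, not a normal exit */
    }
    return WEXITSTATUS(status);
}
```

If the kernel recycled the PID before the waitpid call, the parent would have no reliable way to tell whose status it was reading.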
On the system I'm currently developing for I can only spawn 400 to 500
processes before everything grinds to a halt (it's a badly maintained
CentOS system running on the cheapest VServer I could find - but still
400 zombies are less than a few kB of information)
Something about making a widely accepted kernel behavior a scapegoat for what are clearly frustrations over a badly-maintained/cheap system doesn't seem right.
Typically, your maximum number of processes is limited only by your memory. You can see your limit with:
cat /proc/sys/kernel/threads-max