How do you prevent a file descriptor from being copy-inherited across fork()
system calls (without closing it, of course)?
I am looking for a way to mark a single file descriptor as NOT to be (copy-)inherited by children at fork()
, something like a FD_CLOEXEC-like hack but for forks (so a FD_DONTINHERIT feature if you like). Anybody did this? Or looked into this and has a hint for me to start with?
Thank you
UPDATE:
I could use libc's __register_atfork
__register_atfork(NULL, NULL, fdcleaner, NULL)
to close the fds in child just before fork()
returns. However, the FDs are still being copied so this sounds like a silly hack to me. Question is how to skip the dup()
-ing in child of unneeded FDs.
I'm thinking of some scenarios when a fcntl(fd, F_SETFL, F_DONTINHERIT)
would be needed:
fork()
will copy an event FD (e.g. epoll()
); sometimes this isn't wanted, for example FreeBSD is marking the kqueue() event FD as being of a KQUEUE_TYPE and these types of FDs won't be copied across forks (the kqueue FDs are skipped explicitly from being copied, if one wants to use it from a child it must fork with shared FD table)
fork()
will copy 100k unneeded FDs to fork a child for doing some CPU-intensive tasks (suppose the need for a fork()
is probabilistically very low and programmer won't want to maintain a pool of children for something that normally wouldn't happen)
Some descriptors we want to be copied (0, 1, 2), some (most of them?) not. I think full FD table duping is here for historic reasons but I am probably wrong.
How silly does this sound:
- patch
fcntl()
to support the dontinherit flag on file descriptors (not sure if the flag should be kept per-FD or in a FD table fd_set, like the close-on-exec flags are being kept
- modify
dup_fd()
in kernel to skip copying of dontinherit FDs, same as FreeBSD does for kq FDs
consider the program
#include <stdio.h>
#include <unistd.h>
#include <err.h>
#include <stdlib.h>
#include <fcntl.h>
#include <time.h>
static int fds[NUMFDS];
clock_t t1;
static void cleanup(int i)
{
while(i-- >= 0) close(fds[i]);
}
void clk_start(void)
{
t1 = clock();
}
void clk_end(void)
{
double tix = (double)clock() - t1;
double sex = tix/CLOCKS_PER_SEC;
printf("fork_cost(%d fds)=%fticks(%f seconds)
",
NUMFDS,tix,sex);
}
int main(int argc, char **argv)
{
pid_t pid;
int i;
__register_atfork(clk_start,clk_end,NULL,NULL);
for (i = 0; i < NUMFDS; i++) {
fds[i] = open("/dev/null",O_RDONLY);
if (fds[i] == -1) {
cleanup(i);
errx(EXIT_FAILURE,"open_fds:");
}
}
t1 = clock();
pid = fork();
if (pid < 0) {
errx(EXIT_FAILURE,"fork:");
}
if (pid == 0) {
cleanup(NUMFDS);
exit(0);
} else {
wait(&i);
cleanup(NUMFDS);
}
exit(0);
return 0;
}
of course, can't consider this a real bench but anyhow:
root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100 fds)=0.000000ticks(0.000000 seconds)
real 0m0.004s
user 0m0.000s
sys 0m0.000s
root@pinkpony:/home/cia/dev/kqueue# gcc -DNUMFDS=100000 -o forkit forkit.c
root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100000 fds)=10000.000000ticks(0.010000 seconds)
real 0m0.287s
user 0m0.010s
sys 0m0.240s
root@pinkpony:/home/cia/dev/kqueue# gcc -DNUMFDS=100 -o forkit forkit.c
root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100 fds)=0.000000ticks(0.000000 seconds)
real 0m0.004s
user 0m0.000s
sys 0m0.000s
forkit ran on a Dell Inspiron 1520 Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz with 4GB RAM; average_load=0.00
See Question&Answers more detail:
os