Problem has been solved

Dear monks,

I have some scripts running as server processes. The workload has increased, which is why I would like to switch from serial processing to "fork after accept".

To guarantee that every child gets its own DB/RPC handles, new connections are made in every child process.
Child sockets are closed in the parent, parent sockets are closed in the children.
    sub new_client {
        my $sock = $main_socket->accept();
        my $pid = fork();
        unless( defined $pid && $pid == 0 ) {
            $sock->close;
        }
        if ($pid == 0) {
            $main_socket->close; # close main socket in child
            #process_request
            . .
        }
    }
Remark: the code is derived from Msg.pm (Advanced Perl Programming), and the codebase for both cases is IDENTICAL except for one sub (to be found in readmore).
Under certain circumstances (reproducible by calling variant 1 before server start) I end up with bad file descriptors in the child processes, while variant 2 works. accept/fork are done right, but select() on the filehandles for the incoming RPC connection returns EBADF. The parent is still alive and doesn't close any file descriptor it should not.

Warning, lots of stuff to read

    package MyStuff::Database;

    %db_conns = ();

    sub _DBconnect_internal {
        my ($dsn, $user, $password, $host, $tag, $options) = @_;

        if (not exists $db_conns{"$tag:initialized"}) {
            $db_conns{"$tag:data"}->{conn} = [$dsn, $user, $password, $host];
            $db_conns{"$tag:initialized"} = 1;
            &Log("DB Handle $tag ready for connect.") if $_debug > 5;
            return;
        }
        #Connecting happens here
        . . .
    }
    ##END OF PACKAGE

    use MyStuff::Database qw(%db_conns _DBconnect_internal _DoQuery);

Variant 1

    sub DBConnect {
        _DBconnect_internal('DBI:mysql:database=test;host=localhost;port=',
                            'foo', 'bar', 'TestDB', 'mysql');
    }
    . .
    &DBConnect;
    &server_start;
    . .

Variant 2

    use MyStuff::Database qw(%db_conns _DBconnect_internal _DoQuery);

    sub DBConnect {
        $MyStuff::Database::db_conns{"mysql:data"}->{conn} =
            ['DBI:mysql:database=test;host=localhost;port=', 'foo', 'bar'];
        $MyStuff::Database::db_conns{"mysql:initialized"} = 1;
    }
    . .
    &DBConnect;
    &server_start;
    . .

My head is about to explode. The problem is that if I start the server using variant 2, everything works fine.
Using variant 1 leads to EBADF in select().
The strange thing is that both variants do the same thing: they stuff the connect data into %db_conns in the parent process. NO handles are created or connected at this stage. When _DBconnect_internal is called again in the child process, it actually connects to the data source using that data. I can't see anything concerning filehandles there.
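
To make the two-phase behaviour concrete, here is a minimal sketch of the idea only, not the real MyStuff::Database code. The first call for a tag just records the connection parameters; a later call (in the child) opens the connection. The names connect_lazily, %registry and the $connector callback are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical registry, loosely modelled on %db_conns from the post.
my %registry;

# First call for a tag: remember the parameters, open nothing.
# Any later call for the same tag: connect once using the stored
# parameters and return the cached handle afterwards.
sub connect_lazily {
    my ($tag, $connector, @params) = @_;

    if (not exists $registry{"$tag:initialized"}) {
        $registry{"$tag:data"}{conn}  = [@params];
        $registry{"$tag:initialized"} = 1;
        return;    # nothing has been opened at this point
    }

    my $stored = $registry{"$tag:data"}{conn};
    $registry{"$tag:data"}{handle} ||= $connector->(@$stored);
    return $registry{"$tag:data"}{handle};
}
```

With DBI the callback would presumably be something like sub { DBI->connect(@_) }, called once in the parent to register the data and again in each child after the fork to actually connect.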

For compatibility reasons I have to stick with variant 1.

These are the strace excerpts:
Working fork:

    clone(Process 22290 attached (waiting for parent)
    Process 22290 resumed (parent 22276 ready)
    child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7d4e908) = 22290
    [pid 22290] close(3 <unfinished ...>
    [pid 22276] close(4 <unfinished ...>
    [pid 22290] <... close resumed> ) = 0
    [pid 22276] <... close resumed> ) = 0
    [pid 22276] select(8, [3], NULL, NULL, NULL <unfinished ...>
    [pid 22290] open("/usr/share/locale/en_US.ISO-8859-15/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    [pid 22290] open("/usr/share/locale/en_US.iso885915/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    [pid 22290] open("/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    [pid 22290] open("/usr/share/locale/en.ISO-8859-15/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    [pid 22290] open("/usr/share/locale/en.iso885915/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    [pid 22290] open("/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    [pid 22290] stat64("/dev/log", {st_mode=S_IFSOCK|0666, st_size=0, ...}) = 0
    [pid 22290] socket(PF_FILE, SOCK_STREAM, 0) = 3
    [pid 22290] ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb3cb8) = -1 EINVAL (Invalid argument)
    [pid 22290] _llseek(3, 0, 0xbffb3cf0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
    [pid 22290] ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbffb3cb8) = -1 EINVAL (Invalid argument)
    [pid 22290] _llseek(3, 0, 0xbffb3cf0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
    [pid 22290] fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
    [pid 22290] connect(3, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0
    [pid 22290] time(NULL) = 1184077806
    [pid 22290] time(NULL) = 1184077806
    [pid 22290] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2295, ...}) = 0
    [pid 22290] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2295, ...}) = 0
    [pid 22290] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2295, ...}) = 0
    [pid 22290] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2295, ...}) = 0
    [pid 22290] select(8, [3], NULL, [3], {0, 0}) = 0 (Timeout)
    [pid 22290] write(3, "<13>Jul 10 16:30:06 rpcsrv[22290"..., 66) = 66
    [pid 22290] time(NULL) = 1184077806
    [pid 22290] getpeername(4, {sa_family=AF_INET, sin_port=htons(51130), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
    [pid 22290] getpeername(4, {sa_family=AF_INET, sin_port=htons(51130), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
    [pid 22290] select(8, [3 4], NULL, NULL, NULL) = 1 (in [4])
    [pid 22290] read(4, "\0\0\0$", 4) = 4
    [pid 22290] read(4, "FrT;@5|$1|>$1|1$1|s$9|main::inc$"..., 36) = 36
    . . . .

This is from another process using DBConnect_not_ok before server start
Not working:

    clone(Process 22302 attached (waiting for parent)
    Process 22302 resumed (parent 22300 ready)
    child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7d1b908) = 22302
    [pid 22302] close(4 <unfinished ...>
    [pid 22300] close(5 <unfinished ...>
    [pid 22302] <... close resumed> ) = 0
    [pid 22300] <... close resumed> ) = 0
    [pid 22300] select(8, [4], NULL, NULL, NULL <unfinished ...>
    [pid 22302] time(NULL) = 1184077872
    [pid 22302] time(NULL) = 1184077872
    [pid 22302] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2295, ...}) = 0
    [pid 22302] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2295, ...}) = 0
    [pid 22302] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2295, ...}) = 0
    [pid 22302] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2295, ...}) = 0
    [pid 22302] select(8, [3], NULL, [3], {0, 0}) = 0 (Timeout)
    [pid 22302] write(3, "<13>Jul 10 16:31:12 rpcsrv[22302"..., 66) = 66
    [pid 22302] time(NULL) = 1184077872
    [pid 22302] getpeername(5, {sa_family=AF_INET, sin_port=htons(51131), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
    [pid 22302] getpeername(5, {sa_family=AF_INET, sin_port=htons(51131), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
    [pid 22302] select(8, [4 5], NULL, NULL, NULL) = -1 EBADF (Bad file descriptor)
    . . .

Certain (failing) ioctl calls are missing in the second strace log; that is all I can see from it. Why is beyond my current understanding.
I have been trying to find a solution for almost a week now and I'm out of options. Thanks for any hint on how to solve this.
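
One way to narrow down an EBADF from select() is to probe every descriptor of the set on its own. In the failing trace the child calls close(4) right after the fork and later passes [4 5] to select(), so a probe like the following might show which number is stale. This is only a diagnostic sketch: the helper name bad_fds is hypothetical, and it assumes plain descriptor numbers as reported by fileno() or strace.

```perl
use strict;
use warnings;
use Errno qw(EBADF);

# Return those descriptor numbers for which a zero-timeout select()
# fails with EBADF, i.e. which are not open in this process.
sub bad_fds {
    my @bad;
    for my $fd (@_) {
        my $rin = '';
        vec($rin, $fd, 1) = 1;          # bit vector holding only this fd
        my $n = select(my $rout = $rin, undef, undef, 0);
        push @bad, $fd if $n == -1 && $! == EBADF;
    }
    return @bad;
}
```

Called in the child just before the blocking select(), for example as bad_fds(4, 5), it would list the descriptors the kernel rejects, which can then be compared with the close() calls in the trace.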

Best regards,
gnork

cat /dev/world | perl -e "(/(^.*? \?) 42\!/) && (print $1))"
errors->(c)

In reply to Mysterious bad file descriptor after fork by gnork
