in reply to Re^3: system command - OK on 32 bit, fails on 64 bit Linux - why?
in thread system command - OK on 32 bit, fails on 64 bit Linux - why?

The array sizes you have imply a total allocation of around 2Gb on a 32-bit system, and around 2.5Gb on a 64-bit system. This might be some magical number that is causing the fork to fail. How much virtual memory do the two servers have? Try running the code under strace, and see if there are any interesting system call errors:
$ strace -f -o /tmp/strace.out perl foo.pl
$ grep '= -1 E' /tmp/strace.out
Also, for debugging purposes, please call date just once and print the full, unadulterated value of $?, rather than calling it three times with three separate snippets, e.g.
system("date");
printf "\$?=0x%x \$!=%s\n", $?, $!;
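
To go one step further than printing the raw value, here is a minimal sketch of the checks described in perldoc -f system. The helper name run_and_report is made up for illustration. Note that $! is only documented to be meaningful when system itself returns -1; otherwise it may hold a stale error from some earlier call.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): run a command and
# describe how it ended, following the checks in perldoc -f system.
sub run_and_report {
    my @cmd = @_;
    my $rc = system(@cmd);
    if ($rc == -1) {
        # The command could not be started (fork or exec failed);
        # only in this case is $! meaningful.
        return "failed to start: $!";
    }
    if ($? & 127) {
        # Low 7 bits of $? hold the signal that killed the child.
        return sprintf "died with signal %d", $? & 127;
    }
    # High byte of $? holds the child's exit status.
    return sprintf "exited with status %d", $? >> 8;
}

print run_and_report("date"), "\n";
```

If the fork fails with ENOMEM, one would expect the first branch to be taken, so a run that never reaches it is itself a useful data point.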

Dave.

Re^5: system command - OK on 32 bit, fails on 64 bit Linux - why?
by geep999 (Novice) on Feb 18, 2010 at 10:06 UTC
    Hi and thanks,
    grep '= -1 E' /tmp/strace.out shows this as the final message:
    5215  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fc213c1b780) = -1 ENOMEM (Cannot allocate memory)
    $? and $! report:
    Before array setup: $?=0x0 $!=Operation not permitted
    After array setup: $?=0x0 $!=Cannot allocate memory

    For details of free memory via free -m see below. Also ulimit -a

    So my problem appears to be memory. I have 4Gb RAM and 2.4 Gb swap. I thought that on 64 bit the array would be swapped out to allow the system command to run. Seems it isn't.

    On 32 bit there is just enough memory to allow the system command to run.

    I do have in mind a crude workaround for the problem, involving a simple daemon to replace the system call to run PovRay, but I am still interested to understand why the system thinks it can't allocate memory when there's Gbytes of swap available.
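
    One way such a workaround could look, as a rough sketch only: fork a small helper before the large arrays are built, then hand it commands over a pipe, so the fork inside system happens in a process with a small footprint. All names here (start_helper, run_via_helper, stop_helper) are hypothetical, and the protocol (one command per line, one wait status per line back) is an assumption made for the example.

    ```perl
    use strict;
    use warnings;
    use IO::Handle;
    use POSIX ();

    # Hypothetical sketch: start the helper *before* the big allocation.
    sub start_helper {
        pipe(my $cmd_r, my $cmd_w)       or die "pipe: $!";
        pipe(my $status_r, my $status_w) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: stays small, so its own fork in system() is cheap.
            close $cmd_w;
            close $status_r;
            $status_w->autoflush(1);
            while (my $line = <$cmd_r>) {
                chomp $line;
                my $rc = system($line);      # wait status, or -1
                print {$status_w} "$rc\n";
            }
            POSIX::_exit(0);                 # skip the parent's cleanup
        }
        # Parent: keep the write end for commands, read end for statuses.
        close $cmd_r;
        close $status_w;
        $cmd_w->autoflush(1);
        return { pid => $pid, to => $cmd_w, from => $status_r };
    }

    # Send one command line and wait for the helper's reply.
    sub run_via_helper {
        my ($h, $command) = @_;
        print { $h->{to} } "$command\n";
        my $status = readline($h->{from});
        die "helper went away" unless defined $status;
        chomp $status;
        return $status;
    }

    # Closing the command pipe makes the helper's read loop end.
    sub stop_helper {
        my ($h) = @_;
        close $h->{to};
        close $h->{from};
        waitpid $h->{pid}, 0;
    }

    my $helper = start_helper();
    # ... build the large arrays here ...
    my $status = run_via_helper($helper, "date");
    print "wait status: $status\n";
    stop_helper($helper);
    ```

    This does not explain the ENOMEM, it only sidesteps it, and it assumes commands never contain newlines.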

    Cheers, Peter

    free -m before and after array setup
    64 bit
    Before:
                 total       used       free     shared    buffers     cached
    Mem:          3948       1202       2746          0         63        418
    -/+ buffers/cache:        720       3228
    Swap:         2392          0       2392
    After:
                 total       used       free     shared    buffers     cached
    Mem:          3948       3931         17          0          3         93
    -/+ buffers/cache:       3834        114
    Swap:         2392         51       2341

    32 bit
    Before:
                 total       used       free     shared    buffers     cached
    Mem:          3549        954       2594          0         77        362
    -/+ buffers/cache:        514       3034
    Swap:         2392          0       2392
    After:
                 total       used       free     shared    buffers     cached
    Mem:          3549       3437        111          0         81        359
    -/+ buffers/cache:       2996        553
    Swap:         2392          0       2392

    64 bit:
    ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 36864
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 36864
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

      There are two issues here: an OS one and a perl one.

      First, why the OS won't allow the fork. Linux usually allows a certain amount of overcommitting of swap, which would normally let the fork followed by exec run the date command safely. My back-of-the-envelope calculations of the size of the perl process seem to have been low: from your numbers, it looks like the 32-bit perl is 2482Mb, while the 64-bit one is 3165Mb. This means the 64-bit one can't fork without overcommitting. Perhaps you've disabled overcommitting on your systems? Mine show the following, which are the defaults:
      # cat /proc/sys/vm/overcommit_memory
      0
      # cat /proc/sys/vm/overcommit_ratio
      50

      The perl issue is why the ENOMEM from the clone system call isn't being reported by perl's fork function. This smells like a bug.
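
      For anyone who wants to check the overcommit settings from inside the script rather than from a shell, here is a small sketch. The helper names read_proc_value and parse_meminfo are invented for this example, the // operator needs perl 5.10 or later, and it assumes a Linux /proc layout with the CommitLimit and Committed_AS fields in /proc/meminfo. As far as I know the kernel only enforces CommitLimit when overcommit_memory is 2; in the default mode 0 it applies a heuristic instead.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: read a single-value file under /proc.
      # Returns undef if the file cannot be read.
      sub read_proc_value {
          my ($path) = @_;
          open my $fh, '<', $path or return;
          my $value = <$fh>;
          close $fh;
          return unless defined $value;
          chomp $value;
          return $value;
      }

      # Hypothetical helper: parse /proc/meminfo-style text into a
      # hash reference of field name => number (kB for most fields).
      sub parse_meminfo {
          my ($text) = @_;
          my %info;
          while ($text =~ /^([^:\n]+):\s+(\d+)/mg) {
              $info{$1} = $2;
          }
          return \%info;
      }

      my $mode  = read_proc_value('/proc/sys/vm/overcommit_memory');
      my $ratio = read_proc_value('/proc/sys/vm/overcommit_ratio');
      printf "overcommit_memory=%s overcommit_ratio=%s\n",
          $mode // 'n/a', $ratio // 'n/a';

      if (open my $fh, '<', '/proc/meminfo') {
          local $/;    # slurp the whole file in one read
          my $info = parse_meminfo(<$fh>);
          for my $key (qw(MemTotal SwapTotal CommitLimit Committed_AS)) {
              printf "%-13s %s kB\n", $key, $info->{$key} // 'n/a';
          }
      }
      ```

      Printing these just before the failing system call, next to the size of the perl process, should show how much headroom the kernel thinks it has.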

      Dave.

        This smells like a bug

        Definitely. You seem to have a good grasp on it. Are you going to report it?