jhazra has asked for the wisdom of the Perl Monks concerning the following question:

Hi folks, I am running a Perl (5.6) program on Solaris. The main program opens 4 DB connections (Sybase::DBlib), fetches about 250k rows of data from the DB, and then forks 6 child processes. The child processes use the data obtained by the parent and also fetch more data from the DB by opening their own connections (4 per forked process). Each forked process writes out 4 files after doing some processing. Once in a while these processes get a "Not enough space" error in $! and keep running. I don't see this error unless I print $!. Most of the file open and close calls are checked with die.
My question is: why is this happening? How can I debug it? Is there a way to pinpoint the cause? I probably can't run in -Dm mode, as the number of passes is more than 250k * 6.

Here are the system details:
OS : Solaris
CPU : 12 (I guess)
top shows:


load averages: 8.41, 6.44, 5.58 00:03:06
88 processes: 74 sleeping, 2 running, 12 on cpu
Memory: 8192M real, 1645M free, 10G swap in use, 2518M swap free
  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME   CPU COMMAND
 1686 jack       1  20    0 1129M 1102M cpu11   0:36 6.84% jack.pl
 1685 jack       1  20    0 1129M  655M cpu6    0:35 6.40% jack.pl
 1683 jack       1  20    0 1129M  655M cpu1    0:34 5.59% jack.pl
 1687 jack       1   0    0 1129M  655M sleep   0:33 5.62% jack.pl
 1684 jack       1   0    0 1129M  655M cpu7    0:33 5.45% jack.pl
 1682 jack       1   0    0 1129M  655M sleep   0:33 5.15% jack.pl
  155 jack       1  40    0 1124M 1116M sleep  10:54 0.58% jack.pl
.....
....


Enough disk space is available on the system. I checked the code and there are no array indexing issues involved; mostly hashes have been used.

appreciate your help.
cheers
Jayanta

Replies are listed 'Best First'.
Re: Error: Not enough space
by ikegami (Patriarch) on Jan 04, 2005 at 22:59 UTC
    When are you checking $!? It's not always safe to check $!. It's only safe to check $! after you've been told there was an error by a command that puts something in $! (i.e. after a system call returns false).
      Hi guys,
      Thanks for your responses.
      Here's the code flow:
      Please note that the error does not occur when I run only 3 child processes. Sometimes even all 6 run without any error. Find the ulimit output below:
      xxxx.pl:
      ======

      ulimit output

      core file size (blocks) unlimited
      data seg size (kbytes) unlimited
      file size (blocks) unlimited
      open files 1024
      pipe size (512 bytes) 10
      stack size (kbytes) 8192
      cpu time (seconds) unlimited
      max user processes 29995
      virtual memory (kbytes) unlimited

      I ran the program after increasing the ulimit to 16M, but no luck!
      Free disk space on the system is plentiful (4-5G), and the files being created are quite small (the total size of all the files may be 1G or so).
      I check for the error with the following code snippet:
      if ($!) {
          print "Error occured: $!";
          # reset the error so that we can catch next genuine error condition
          $! = 0;
      }

      Note that after this piece prints the error (which would mean the program ran out of memory), the program still runs (the remaining passes of the loop, and the other child processes) and there is no error for some time. The error is generated again (though not always) after a while. So if the error were really due to some concrete memory issue, then once it occurred (memory exhausted), all 6 child processes should fail to proceed and all of them should throw the same error. That is not happening!

      Edited by davido to add code and readmore tags.

        I think the point has already been made. You aren't supposed to look in $! unless an error has just occurred. For instance:

        print $fh "foo\n" or die "Error writing: $!";

        Since we don't know where your $! testing code occurs, there are any number of ways $! could have been set. For all we know, some part of your code catches an error at some point and leaves $! set. Consider the following program:

        perl -le "print BLAH 'foo' or warn $!; print 'foo'; print $!";

        The error message persists long after it's relevant.
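
A minimal sketch of how a stale $! can mislead a bare `if ($!)` check. The helper name stale_errno_demo and the path are made up for illustration and assume that path does not exist. The open fails and its error is deliberately ignored, then unrelated work succeeds, yet $! still holds the old error number.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the numeric errno left behind by an
# unchecked, failed open after unrelated work has completed.
sub stale_errno_demo {
    $! = 0;
    # This open fails (assumed: the directory does not exist); the
    # failure is ignored on purpose to mimic unchecked code.
    open(my $fh, '<', '/nonexistent-dir-for-demo/in.txt');

    my $total = 0;
    $total += $_ for 1 .. 1000;    # unrelated work that succeeds

    return $! + 0;                 # still the errno from the failed open
}

my $code = stale_errno_demo();
print "errno seen after successful work: $code\n";
```

A check placed at the bottom of a loop would report this leftover value even though nothing in the current pass failed.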

        As an aside: please use <code> tags around any code you post. At the very least it will avoid errors like the one at the bottom of your post. ;-)

        HTH

        ---
        demerphq

        ...
        my $x    = db_login( ... ) || die( "Could not open ..." );
        my $xx   = db_login( ... ) || die( "Could not open ..." );
        my $xxx  = db_login( ... ) || die( "Could not open ..." );
        my $xxxx = db_login( ... ) || die( "Could not open ..." );
        my %j    = <data from DB>;
        my %jj   = <data from DB>;
        my %jjjj = <data from DB>;
        my $pid1 = &executeChild( ....... );
        my $pid2 = &executeChild( ....... );
        my $pid3 = &executeChild( ....... );
        my $pid4 = &executeChild( ....... );
        my $pid5 = &executeChild( ....... );
        my $pid6 = &executeChild( ....... );
        ...

        Definitely time to refactor. Or even better yet, start from scratch if time and budget allow for it.

        Check return codes from everything that can fail. In particular, see if fork is returning undef. I bet that's your problem, though it's just my intuition.
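
A hedged sketch of checking fork's return value. The poster's executeChild is not shown, so spawn_child below is a hypothetical stand-in, not the original code. fork returns undef on failure, and that is a point where $! is meaningful.

```perl
use strict;
use warnings;

$| = 1;    # unbuffer STDOUT so children do not inherit pending output

# Hypothetical helper: fork a child that runs $work, return the child's PID.
# Dies with $! if fork itself fails.
sub spawn_child {
    my ($work) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        $work->();    # child: do the work, then exit without returning
        exit 0;
    }
    return $pid;      # parent: hand back the child's PID
}

my @pids;
push @pids, spawn_child(sub { 1 }) for 1 .. 2;    # placeholder work

for my $pid (@pids) {
    waitpid($pid, 0);
    printf "child %d exited with status %d\n", $pid, $? >> 8;
}
```

If fork fails for lack of memory or swap, the die fires in the parent at the moment of failure instead of the error surfacing later through an unrelated check.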
Re: Error: Not enough space
by Zaxo (Archbishop) on Jan 04, 2005 at 22:58 UTC

    If you can verify that the error is indeed ENOSPC with,

    use Errno qw(ENOSPC); warn "Yes, it's ENOSPC\n" if ENOSPC == 0+$!;
    then you are definitely running out of drive space. Perhaps you have plenty on the system, but you could be running out or hitting a quota on the particular fs you're writing to.
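
One caution worth checking before chasing disk quotas: the message text alone may not identify ENOSPC. Elsewhere in this thread, setting $! to 12 prints "Not enough space", and 12 is ENOMEM on Solaris and Linux, while ENOSPC usually reads "No space left on device". A small sketch, with a hypothetical classify_errno helper, that compares the numeric code rather than the message:

```perl
use strict;
use warnings;
use Errno qw(ENOMEM ENOSPC);

# Hypothetical helper: name the errno behind a numeric code.
sub classify_errno {
    my ($code) = @_;
    return 'ENOMEM (out of memory or swap)'   if $code == ENOMEM;
    return 'ENOSPC (no space left on device)' if $code == ENOSPC;
    return 'other';
}

# Print the platform's own message next to each classification.
for my $code (ENOMEM, ENOSPC) {
    $! = $code;
    print "$code: $! => ", classify_errno($code), "\n";
}
$! = 0;    # leave no stale error behind
```

In real code, classify_errno(0 + $!) would be called right after a call that reported failure. The messages printed depend on the operating system, so the numeric comparison is the reliable part.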

    This would be easier to diagnose if you showed code.

    After Compline,
    Zaxo

Re: Error: Not enough space
by PodMaster (Abbot) on Jan 04, 2005 at 22:52 UTC
    What is happening is, once in a while these processes are throwing "Not enough space" error in $! and continuing running. I dont see this error unless I print $!.
    So as a rule, you don't check the value of $!? That's not good.
    My question is, why is this happening?
    Because you're running out of space (due to some kind of system admin imposed limit maybe {ulimit})? Who knows, maybe even
    $! = 12;
    die "Uh oh: $!";
    __END__
    Uh oh: Not enough space at - line 2.
    Sounds like you'd benefit from a code review.
