in reply to Problem opening file

No need to get complicated. If you don't want the script to die, don't use die.

my $lockfile = $destDir."loadqueue.lck";
while (!open my $fhLock, ">", $lockfile)
{
    print "Failed to open $lockfile: $!";
    sleep(3); ### Changed from 3000 to 3 (parameter is seconds, not milliseconds!)
}

As to why it's failing -- without more information, hard to tell. My guess is that in the process of opening a file -- usually an atomic operation -- the thread may have stepped on something another thread was doing (or vice versa), and failed without a highly visible cause.

Thus, my first stab at a workaround was to let it breathe 3 seconds (or whatever is appropriate for the environment) and try again.
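
A slightly fuller sketch of that retry idea, in case it helps. The helper name open_with_retry and its parameters are made up for illustration; they are not from the original code. Two deliberate differences from the loop above: the number of attempts is bounded, and the handle is declared inside the helper and returned, because a "my" variable declared in a while condition is only visible inside that loop.

```perl
use strict;
use warnings;

# Hypothetical helper: try to open $path for writing, retrying on failure.
# Returns the open filehandle, or undef once $max_tries attempts have failed.
sub open_with_retry {
    my ($path, $max_tries, $delay_secs) = @_;
    for my $attempt (1 .. $max_tries) {
        # The handle is lexical to this block, so return it on success.
        if (open my $fh, '>', $path) {
            return $fh;
        }
        warn "Attempt $attempt: failed to open $path: $!\n";
        # sleep() takes seconds; skip the pause after the final attempt.
        sleep $delay_secs if $attempt < $max_tries;
    }
    return;
}
```

Something like my $fhLock = open_with_retry($lockfile, 5, 3); would then give up after five tries, leaving the caller to decide what to do when it gets undef back.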

Edit: Changed sleep(3000) to sleep(3) (stupid human tricks when switching back and forth between languages).

Re^2: Problem opening file
by kcott (Archbishop) on Jul 08, 2015 at 17:03 UTC
    "... let it breathe 3 seconds ..."

    From sleep:

    "Causes the script to sleep for (integer) EXPR seconds, ..."

    You're letting it breathe for 50 minutes. :-)

    [Also sleep(3000) in the code in your second response.]
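
    If a delay in milliseconds is what you are used to from other languages, one option is a small wrapper around the core Time::HiRes module, whose sleep accepts fractional seconds. This is only a sketch; the name sleep_ms is invented here.

    ```perl
    use strict;
    use warnings;
    use Time::HiRes ();

    # Hypothetical helper: pause for a delay given in milliseconds.
    sub sleep_ms {
        my ($ms) = @_;
        # Time::HiRes::sleep accepts fractional seconds, unlike the builtin.
        Time::HiRes::sleep($ms / 1000);
        return;
    }
    ```

    With that, sleep_ms(3000) pauses for about 3 seconds rather than 50 minutes.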

    -- Ken

      D'oh!

      Fixing...

        Please mark changes in prior postings where you make the change.

        Indicating that you've changed something in a new response (ie, the parent of this one) leaves those (like me, duh!) wondering if the eyes have failed, since there's no evidence of kcott's objection in the node to which he replied.

        UPDATE: I must have caught you adding the "Edit" note. All's well now, but I think I'll leave the above as is (a style guideline) for newbies who stumble onto this.

Re^2: Problem opening file
by SimonPratt (Friar) on Jul 09, 2015 at 09:38 UTC
    "As to why it's failing -- without more information, hard to tell. My guess is that in the process of opening a file -- usually an atomic operation -- the thread may have stepped on something another thread was doing (or vice versa), and failed without a highly visible cause."

    I agree with your comment in principle; however, the error returned by open indicates something a bit more basic is occurring. It's as though the arguments I am passing to open aren't being received (at least that is how I interpret the error).

    I've also done everything I can think of to separate the threads and prevent any toe stomping (such as not share'ing variables, using Thread::Queue for IPC, having a single control to ensure work units that interact with each other can only be passed to the same thread). Although thinking about it now, it is entirely possible that an external process is attempting something it shouldn't be doing. Thanks for your comments and suggestions so far, I'll go have a look at what else is running on the server.

      It's as though the arguments I am passing to open aren't being received

      I agree, and that's my point.

      If, very far under the covers, these threads are sharing some common space for performing the atomic open operation, there could be a race condition where one starts to set up the parameter block, but before the system call is triggered, another thread goes to update the same block of memory, resulting in invalid parameters being in that space once the service call starts to try to read it.

      No doubt one of many possible explanations, but if that's the case, two things come to mind:

      1. A workaround involving retries could buy you a quick and effective, if less than completely graceful, solution, and;
      2. The correct fix, from an engineering perspective, might involve poring over the code for the threads as well as the I/O modules/features to see if there is a prudent place to establish a semaphore or similar gating mechanism to eliminate the race condition.
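
      To make the second option concrete, here is a minimal sketch of one possible gate, assuming the race (if it exists at all) is between threads of the same process. The names $open_gate and gated_open are hypothetical. It uses lock() from threads::shared so that only one thread at a time is inside open; it does nothing about an external process touching the same file.

      ```perl
      use strict;
      use warnings;
      use threads;
      use threads::shared;

      # Hypothetical process-wide gate; every thread locks it before calling open.
      my $open_gate :shared;

      # Hypothetical helper: serialise calls to open() across this process's threads.
      # Returns the filehandle on success, or undef (with $! set) on failure.
      sub gated_open {
          my ($mode, $path) = @_;
          # The lock is released automatically when this sub's scope exits.
          lock($open_gate);
          open(my $fh, $mode, $path) or return;
          return $fh;
      }
      ```

      Each worker would call gated_open('>', $lockfile) in place of a bare open. If the failures persist with the gate in place, that would point away from an in-process race.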

        This suggestion scares the hell out of me, to be honest. You seem to be implying that threads is inherently broken!

        Remember, I'm not using any shared variables at all other than Thread::Queue.