Re: panic: COND_DESTROY(6)
by BrowserUk (Patriarch) on Jan 26, 2012 at 14:35 UTC
You could provide a little more info perhaps?
- OS?
- Perl version?
- threads version?
- threads::shared version?
Am I thinking right that the "6" is the TID of the thread that crashed?
More likely the number is the numeric error code. On Windows, 6 is ERROR_INVALID_HANDLE ("invalid handle"), which here would come from the failed attempt to close the semaphore associated with a threads::shared condition variable:
#define COND_DESTROY(c) \
    STMT_START { \
        (c)->waiters = 0; \
        if (CloseHandle((c)->sem) == 0) \
            Perl_croak_nocontext("panic: COND_DESTROY (%ld)",GetLastError()); \
    } STMT_END
Of course it might mean something different on other OSes.
Your best bet would be to post the code, assuming it isn't too large or proprietary or require too much in the way of unique set-up.
If it is, then try to reduce as much as possible whilst still having the error occur. (I appreciate that can be difficult with transient errors like this.) But it will be very hard to advise without sight of the code in question.
If it is the invalid handle problem, the most likely cause is the handle being closed twice, but working out how that might occur will require sight of the code.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
| [reply] [d/l] |
I've just updated my post with OS info and such.
Tomorrow I'll respond to the rest of your post.
| [reply] |
I can't really put the code here since I'm bound by my company's policy.
But maybe there is another way around this. Someone suggested that this may be related to semaphores in my code. But I don't use semaphores, only locks (I'm locking an Object::InsideOut object). I assume that Perl's locks are implemented using low-level semaphores?
| [reply] |
typedef struct win32_cond { LONG waiters; HANDLE sem; } perl_cond;
When a condition variable is garbage collected (DESTROYed), the semaphore handle is closed, then the memory for the struct is freed. The panic you are seeing is occurring when the attempt to close the semaphore handle fails. The only way I can see this happening is if there is a second attempt to DESTROY a condition variable that has previously been destroyed.
That would put the root cause of the problem outside your code and firmly within Perl/threads::shared. But that doesn't help you solve or work around your problem; nor does it give the maintainers any clue as to the circumstances under which the bug is occurring.
The only viable long-term way forward that I see is for you to remove as much of the proprietary code and dependencies from the code as you can, whilst retaining the flow that causes the bug to occur, and then post that. Odds are that this would allow us to find a workaround that you could fold back into your proprietary code, and give the maintainers a testcase on which to base a future fix.
Looking at the change history for threads::shared, there were changes relating to shared object destruction in the latest build (which you are using), and earlier in version 1.33. My first step would be to downgrade threads::shared on your installation to version 1.32 and see if that 'fixes' the problem.
But for a long term fix, you should really consider trying to come up with a cut-down testcase for the problem, that you have permission to publish. (The smaller the better!).
| [reply] [d/l] |
I can't really put the code here since I'm bound by my company's policy.
Could you reduce your code to a minimal version that demonstrates the problem, is short enough to post, and contains no proprietary information?
For example, about a week ago, I also posted a question relating to threads. The initial problem I saw was in a big and secret perl script, that I would definitely not be allowed to post, but I reduced the script by removing & commenting out big blocks of code until I was left with a 25 line script that demonstrated the problem.
That script contains nothing secret so there is no problem posting it, and also it is much shorter so it is easy for our fellow monks to understand the problem.
| [reply] |
Re: panic: COND_DESTROY(6)
by choroba (Cardinal) on Jan 26, 2012 at 14:37 UTC
Is the application written in Perl? If yes, can you show how threads are handled in the code?
Re: panic: COND_DESTROY(6)
by locked_user sundialsvc4 (Abbot) on Jan 26, 2012 at 14:53 UTC
Superficial googling suggests that cond_destroy is a Unix system call which, per the documentation, is expected to return zero, with any nonzero value indicating some kind of error. We may presume (guess...) that here it is instead returning 6. The documentation unfortunately does not go on to list the error codes. Perhaps you can chase down the OS source code of the call handler or otherwise find a detailed explanation of each possible return code, but perhaps the discussion in the man page alone will provoke some worthy ideas. The fact that the problem occurs rarely but consistently pretty much establishes that what you have is some kind of rare-but-possible race condition bug in your code, which is “a one-in-a-million chance, but it is executed millions of times.” Sux, but that’s always the way that such things are. You didn’t need all that extra hair, anyway . . .
| [reply] |
I admit that it is becoming quite entertaining to provoke you. It’s always so consistently successful... :->
I ring the little bell, you jump. Every time.