in reply to Re^4: threads->new falling in a heap.
in thread threads->new falling in a heap.

Add to that the fault I'm chasing is intermittent

From experience, using a debugger is very unlikely to allow you to track down intermittent threading faults anyway. The very act of running code under debugger control inevitably changes the dynamics.

If you are exceptionally lucky, the change will make the problem occur more frequently and reliably, but in 20+ years of working with threaded code that happened exactly once. On every other occasion, even semi-reliable bugs would fail to manifest themselves at all under the debugger and reappear as soon as it was taken out of the picture.

In my experience, the first thing to do with intermittent bugs is make them reproducible. That means running the code in a controlled (repeatable) way in a production-like environment until the problem is reliably reproducible.

The next thing to do is track down what is going on, and where, when the problem occurs. And that always means adding widespread, low-granularity, low-overhead logging.

Often the best logging mechanism is very simple, unformatted output (e.g. just the current thread ID and line number) sent with the UDP sendto() function to a local port. On the other end of that port you have a program that simply listens to the port and logs the data to disk (preferably a disk not used by the monitored program; a USB thumb-drive is ideal!), in a tight loop with no attempt at interpreting it.
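
A minimal Perl sketch of that idea, for illustration only. The sub names (open_trace_socket, trace, collect) and the "tid:line" record format are my own assumptions, not anything from the code under discussion. It uses IO::Socket::INET rather than a raw sendto() call.

```perl
use strict;
use warnings;
use threads;
use IO::Socket::INET;

# Sender side. open_trace_socket() and trace() are hypothetical names.
sub open_trace_socket {
    my ($port) = @_;
    my $sock = IO::Socket::INET->new(
        Proto    => 'udp',
        PeerAddr => '127.0.0.1',
        PeerPort => $port,
    ) or die "cannot create trace socket: $@";
    return $sock;
}

# One tiny datagram per call: "<thread id>:<line>\n". Send errors are
# ignored on purpose; losing a record beats disturbing the timing.
sub trace {
    my ($sock, $line) = @_;
    $sock->send(threads->tid() . ":$line\n");
    return;
}

# Listener side: copy raw datagrams to a log handle, uninterpreted.
# $max exists only so the loop can be stopped; omit it to run forever.
sub collect {
    my ($listener, $log_fh, $max) = @_;
    my $count = 0;
    while (!defined $max || $count < $max) {
        my $buf;
        defined $listener->recv($buf, 1024) or last;
        print {$log_fh} $buf;
        $count++;
    }
    return $count;
}
```

In the monitored program you would call trace($sock, __LINE__) at the points of interest. The listener would normally be a separate process that binds a UDP socket (Proto => 'udp', LocalAddr => '127.0.0.1', LocalPort => whatever port you picked) and calls collect() with a filehandle opened on the log disk.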

This logging can be added to the code at (say) the entry and exit points of the major functions, with little or no impact on the performance or dynamics of the code being monitored. Once the error occurs, you can inspect that log to work out where each thread was when it occurred. You can then remove most of the logging and increase the granularity within those functions active when the bug manifests. Re-run and gradually 'zoom in' on the specific circumstances that cause the bug to arise.
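
One way to get entry/exit records without editing every function by hand is to wrap the subs of interest. Again, only a sketch under assumptions: instrument() and $EMIT are hypothetical names, and $EMIT defaults to STDERR here purely so the example runs on its own; in practice it would send the record over the UDP socket.

```perl
use strict;
use warnings;
use threads;

# Where trace records go. Assumption: replace this with a UDP send in
# real use; STDERR is only a stand-in so the sketch is self-contained.
our $EMIT = sub { print STDERR @_ };

# Wrap a named sub (fully qualified, e.g. 'main::take_snap') so that
# it emits "<tid> > name" on entry and "<tid> < name" on exit.
# Note: if the wrapped sub dies, no exit record is emitted.
sub instrument {
    my ($name) = @_;
    no strict 'refs';
    no warnings 'redefine';
    my $orig = \&{$name};
    *{$name} = sub {
        $EMIT->(threads->tid() . " > $name\n");
        # Preserve the caller's context (list vs. scalar/void).
        my @ret = wantarray ? $orig->(@_) : scalar $orig->(@_);
        $EMIT->(threads->tid() . " < $name\n");
        return wantarray ? @ret : $ret[0];
    };
    return;
}
```

Start by instrumenting only the major functions. Once the log shows which ones are active when the bug hits, drop the wrappers elsewhere and add finer-grained trace points inside those functions.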

It may sound somewhat crude and slow, but with a little practice (and some well-crafted macros if you are using C), it is very effective. Once you know where each thread is (and therefore what it is doing) when the bug occurs, it is usually obvious where the problem lies.

If your code is not proprietary, I'd be willing to take a look. No promises -- I probably couldn't run it here -- but I might spot something.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re^6: threads->new falling in a heap.
by Steve_BZ (Chaplain) on May 11, 2012 at 20:09 UTC

    Hi BrowserUK,

    First of all, let me thank you for your very kind offer of looking at the code.

    In fact, I have isolated the problem using the steps you suggest. What I thought was an intermittent fault is now 100% reproducible and understood. Really it could be classified as a 'design' problem. While playing a live-streamed video which is being recorded at the same time, if I pause and take a snap of the position, the background process runs off and takes the snap (using the aforementioned 'threads'). Because the snap is taken from the end of the file, the frame has not yet been written completely to the file. If I subtract 1 frame from the position it works fine, but a lot can happen in a frame and it's not an ideal solution. Alternatively, and this is what I will do for the time being, I can disable the snap button. The correct answer is to buffer the command until play has resumed, or has stopped altogether and the final frame of the video has been written to the file and the file closed.

    If the video is not paused, the frame has already been written by the time 'threads' gets to it, and all is fine.

    However, I am left feeling that my IDE, which works fine most of the time, has a blind spot, and resolving issues in that blind spot is much more time-consuming than I would like.

    Thank you again for your support.

    Regards

    Steve

      However I am left feeling that my IDE which works fine most of the time, has a blind-spot ...

      I installed a trial copy of Komodo and started it. Even before I loaded any files, it was already running 32 kernel threads to just sit there doing nothing.

      And whilst I cannot be certain, it looks like at least one of those threads and possibly more could be a Java "green threads" scheduler.

      I loaded a (single) perl file and that increased to 43 threads.

      Then I took a look at the DLLs it loads:

      C runtime, C++ runtime, Python runtime, Java runtime, Perl Dev Kit, XUL, OLE, .NET, Active Directory, ActiveX, LDAP; audio drivers, multimedia drivers, Internet Explorer runtime, 3 different graphics frameworks, 3 different encryption libraries, remote access dialer, remote procedure calls; DNS, DHCP, every sockets library known to man; sqlite, MSSQL; ???

      Good grief. Are they trying to compete with EMACS in the operating-system-disguised-as-an-editor Olympiad?

      Frankly, I don't know whether to condemn the authors or just stand with my mouth open and slow hand-clap them in sheer amazement.

      Anyway, glad you solved your problem :)



        IE and Mozilla in the same process. With Java and ActiveX/COM. It's jaw-dropping, but what do developers care? They've all got 16-core Opterons with 32 GB of RAM as their desktops at work.