in reply to Re: Win32::MMF + threads misbehavior
in thread Win32::MMF + threads misbehavior

The primary purpose (as noted in Perl coredump analysis tool ?) is to provide a strace-like capability for running Perl apps. That means the external strace program (let's call it plstrace) needs to share something with the running script that's being traced. Note that plstrace is completely independent of the script being traced, except for the ability to peek into the shared area to see what the script is doing at any given moment; hence threads::shared is not an option for the shared area.

Further, I'd like to be able to support both Win32 and *nix platforms. The most similar solution I can find for those is memory mapped files, via Win32::MMF and Sys::Mmap, respectively. So plstrace and Devel::STrace map to the same file, w/ Devel::STrace acting a bit like Devel::DProf, except simpler: it just keeps track of the call stack and updates the shared area as things change. I try to minimize the number of accesses and locks to keep the overhead as low as I can. (Unfortunately, Win32::MMF does a lot of extra stuff I'd rather it not do in that regard, but in the interest of GOWI, I'll live with it...if I can get Win32::MMF to work.)

Now the fun part: my primary need for this is a large multithreaded application which occasionally hangs in one of the threads (apparently caught in an infinite loop). Hence, Devel::STrace needs to dump traces for all the threads in a process. So, thru a series of clever parlor tricks, each thread gets its own region of the mmf to trace its call stack, from which DB::sub() adds and removes entries, and which DB::DB() updates with line numbers and timestamps. And plstrace attaches to the mmf and dumps its contents every so often. Then I eyeball the output when one of my threads goes 100% CPU, et voila, I know which thread it is and where things are going awry.

Note that I'm not doing anything w/ threads::shared and mmf here. I was *hoping* that all that cloning would properly pick up the tie of the mmf scalar I'm using, and I'd just use CLONE() to invalidate the current mmf region and grab a new one for the new thread. And everything just carries merrily on. (And, wonder of wonders, it actually works on Linux, FC4 w/ Perl 5.8.6! Tho Sys::Mmap has its own set of bizarre behaviors.)

But I don't want to stop there...the next step is multiprocess apps and multithreaded-multiprocess apps. One might question my sanity for pursuing multiprocess support, since the user can always separately attach plstrace to each process manually...but being able to see everything as a group seems useful to me, and (theoretically, at least) should work just as well as a single process, multithreaded solution.

That's why.

Re^3: Win32::MMF + threads misbehavior
by BrowserUk (Patriarch) on Apr 06, 2006 at 01:04 UTC

    Given (my) uncertainty about what happens when you mix MMF/threads/ties et al, I'd offer two alternative approaches:

    1. Have the per thread DB::DB() routines log the trace information to a common (queue) and start a separate thread that reads the queue and writes to the MMF.

      You still have the problem of arranging for different processes to write to different areas of that shared memory without collisions, along with synchronisation between processes.

      This way, you remove the in-process contention and the uncertainty of behaviour surrounding having multiple tied interfaces on separate ithreads attempting to juggle access to a single process global resource.
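A minimal sketch of the first alternative, assuming a Perl built with ithreads and the core Thread::Queue module. The `trace_event` helper and the tab-separated record layout are hypothetical, and the writer thread collects records in an array as a stand-in for the actual MMF writes:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# One queue shared by every traced thread (hypothetical setup).
my $queue = Thread::Queue->new;

# Single writer thread: the only place that would touch the mapping.
# It gathers records in an array here instead of writing to an MMF.
my $writer = threads->create(sub {
    my @written;
    while (defined(my $rec = $queue->dequeue)) {
        push @written, $rec;
    }
    return \@written;
});

# What a DB::DB()-style hook might call: format a record and enqueue it.
sub trace_event {
    my ($sub_name, $line) = @_;
    $queue->enqueue(join "\t", threads->tid, time(), $sub_name, $line);
}

# Two worker threads, each logging three events.
my @workers = map {
    my $id = $_;
    threads->create(sub {
        trace_event("worker_$id", $_) for 1 .. 3;
    });
} 1 .. 2;

$_->join for @workers;
$queue->enqueue(undef);          # undef acts as the end-of-stream marker
my $written = $writer->join;
printf "%d records handed to the writer\n", scalar @$written;
```

Only the writer thread ever needs a handle on the mapping, so no tied variable has to survive cloning into the traced threads.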

    2. Write the external Strace program as a (threaded) tcp server application and have the DB::DB() routines log directly to it via sockets.

      Each thread can create its own connection to the external program, which avoids adding complexity to the process you are trying to debug. You dodge all the problems associated with synchronisation and the conflicts that arise from trying to share global resources between threads through a tied interface. It would probably be a lot faster to boot.

      I'd use a queue in the server to coalesce the inputs from the clients into a coherent, ordered whole for saving or presentation.

    I'd go for the latter approach, as I think that debug tools should impose as little complexity and overhead as possible upon the programs they are debugging, and to my mind, opening and writing to a socket fits that bill quite well.

    Trying to manage allocations of memory and synchronise access to them from multiple threads in multiple (unknown) processes; without creating deadlocks; and without your sync'ing and locking interfering with their own sync'ing and locking--given that you don't know what they might be doing, and indeed you are likely to be trying to help them debug it--just seems like too big a hill to climb.

    Synchronisation of access to memory is the Achilles Heel of threads, and the best way of dealing with it is to avoid doing it whenever possible.

    Beyond that, all I can do is wish you the very best of luck :)


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I guess we'll have to agree to disagree, as my experience of sockets (or even pipes) vs. shared memory (including of the mmap'ed kind) is very much different from yours (i.e., the latter is *very* much simpler and faster, once it's set up).

      As to locking, as I mentioned earlier, I do as little of it as possible: each thread gets its own region for a ring buffer, so the only synchronization required is at thread creation time, in order to allocate a region from the region map created by the root process/thread the first time it gets into DB::DB. Once that's done, no thread (or process) ever gets in the way of another. A simple thread lock (plus a file lock on *nix systems) covers the allocation step (tho admittedly not well on Win32, but fork() on Win32 is an odd duck anyway, which I'm happy to ignore for the present).
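The region-per-thread ring buffer described above might look roughly like the following. This is a sketch under stated assumptions, not the Devel::STrace layout: the sizes, the header fields, and the `claim_region`, `log_entry`, and `read_entries` names are all hypothetical, and an ordinary scalar stands in for the mapped memory. In a real tracer the claim step would sit under the lock mentioned above.

```perl
use strict;
use warnings;

# Hypothetical layout; every size and field here is an assumption.
use constant {
    REGION_SIZE => 1024,   # bytes reserved per thread
    HEADER_SIZE => 8,      # two 32-bit fields: owner id, write counter
    RECORD_SIZE => 48,     # timestamp (4) + line (4) + sub name (40)
};
use constant SLOTS => int((REGION_SIZE - HEADER_SIZE) / RECORD_SIZE);

# Claim the first region whose owner field is zero. Owner ids must be
# nonzero. Returns the region's byte offset, or nothing if all are taken.
sub claim_region {
    my ($buf_ref, $owner_id) = @_;
    my $regions = int(length($$buf_ref) / REGION_SIZE);
    for my $i (0 .. $regions - 1) {
        my $base = $i * REGION_SIZE;
        my ($owner) = unpack('N', substr($$buf_ref, $base, 4));
        next if $owner;
        substr($$buf_ref, $base, HEADER_SIZE) = pack('N N', $owner_id, 0);
        return $base;
    }
    return;
}

# Overwrite the oldest slot in this region, then bump the write counter.
sub log_entry {
    my ($buf_ref, $base, $line, $sub_name) = @_;
    my ($owner, $next) = unpack('N N', substr($$buf_ref, $base, HEADER_SIZE));
    my $off = $base + HEADER_SIZE + ($next % SLOTS) * RECORD_SIZE;
    substr($$buf_ref, $off, RECORD_SIZE) = pack('N N A40', time(), $line, $sub_name);
    substr($$buf_ref, $base + 4, 4) = pack('N', $next + 1);
    return;
}

# Reader side: return the surviving entries, oldest first.
sub read_entries {
    my ($buf_ref, $base) = @_;
    my ($owner, $next) = unpack('N N', substr($$buf_ref, $base, HEADER_SIZE));
    my $count = $next < SLOTS ? $next : SLOTS;
    my @out;
    for my $n ($next - $count .. $next - 1) {
        my $off = $base + HEADER_SIZE + ($n % SLOTS) * RECORD_SIZE;
        my ($ts, $line, $name) = unpack('N N A40', substr($$buf_ref, $off, RECORD_SIZE));
        push @out, { ts => $ts, line => $line, name => $name };
    }
    return @out;
}

# Usage: a plain scalar plays the role of the mapped file.
my $shared = "\0" x (REGION_SIZE * 4);
my $region = claim_region(\$shared, 1);
defined $region or die "no free region";
log_entry(\$shared, $region, 42, 'main::work');
print "$_->{name} line $_->{line}\n" for read_entries(\$shared, $region);
```

Because each writer touches only its own region after the claim, the hot path needs no lock; only `claim_region` has to be serialized.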

      The plstrace app doesn't do any locking: it just reads as it needs, checks that the data looks reasonable, and prints it. This isn't a transaction mgr, just a tool to peek at what's going on inside a running app; a little garbage in the stew won't hurt anything.
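One way a lock-free reader could decide that a record "looks reasonable" is to have the writer stamp the same sequence number at both ends of each record and have the reader discard any record whose stamps disagree. This is an assumed technique offered for illustration, not what plstrace is known to do, and the record format and function names are hypothetical. It is a heuristic only: it catches most torn reads but gives no ordering guarantee across threads.

```perl
use strict;
use warnings;

# Hypothetical record: sequence, line number, sub name, sequence again.
use constant RECORD_FMT => 'N N A32 N';
use constant RECORD_LEN => 44;          # 4 + 4 + 32 + 4 bytes

# Writer side: the same nonzero sequence number brackets the payload.
sub encode_record {
    my ($seq, $line, $name) = @_;
    return pack(RECORD_FMT, $seq, $line, $name, $seq);
}

# Reader side: accept a record only if it has the right length, a
# nonzero sequence, and matching stamps. Returns a hashref or nothing.
sub decode_record {
    my ($bytes) = @_;
    return unless defined $bytes && length($bytes) == RECORD_LEN;
    my ($seq, $line, $name, $seq_check) = unpack(RECORD_FMT, $bytes);
    return unless $seq && $seq == $seq_check;
    return { seq => $seq, line => $line, name => $name };
}

# Usage: a clean record decodes; a half-overwritten one is skipped.
my $clean = encode_record(7, 120, 'main::work');
my $torn  = substr(encode_record(8, 5, 'main::loop'), 0, 22)
          . substr($clean, 22);
for my $bytes ($clean, $torn) {
    my $rec = decode_record($bytes);
    print $rec ? "ok: $rec->{name} line $rec->{line}\n" : "skipped torn record\n";
}
```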

      The most troubling issue is that this is behaving oddly on Win32, an OS that primarily relies on threads (rather than processes) for concurrency, and on memory mapped files for shared memory. So I'd expect things at the system call level to behave a bit more sensibly. Which leads me to believe that the cloning of the mmap'ed tie() may well be doing the damage.

        The most troubling issue is that this is behaving oddly on Win32, an OS that primarily relies on threads (rather than processes) for concurrency, and on memory mapped files for shared memory.

        I may be misinterpreting you here, so I'll apologise in advance in case I am, but...memory mapped files are not used to share memory between threads. All memory, including MMFiles, is automatically(*) accessible to every thread in the process. The fact that a piece of memory may be backed (or not) by a file is completely transparent to all threads within the process. If you knew that, then no harm done :) If you didn't, it might clear up a misunderstanding.

        (*) It's theoretically possible to allocate memory in one thread with a different set of security attributes to another thread in that same process and create the situation where that memory would be inaccessible to the other thread...but you'd have to work at it to create the situation.

        Which leads me to believe that the cloning of the mmap'ed tie() may well be doing the damage.

        That's part of what I've been trying to get at--but only a part.

        Sharing tied vars through the cloning process suffers the same problems as sharing objects that way. After all they are basically the same thing with a predefined and restricted set of methods. Whilst you can get away with sharing cloned objects between threads (by reblessing), because the cloned object handles act as proxies for the real data, in the case of Win32::MMF objects (and by implication Win32::MMF::Sharable ties), it gets awfully fuzzy because you will effectively be mapping the file into the process twice (or more times). As I said earlier, I'm not exactly sure, and I can't find anything online that discusses it, what will happen at the OS level in that case. It might work fine, and give you the same handle back. It might give you a second handle to the same piece of memory and still work fine. Or it might get awfully confused, I know I am :)

        I simply do not know, but combining that uncertainty with the vagaries of sharing a tie between threads leaves me in a place where I couldn't even hazard a guess as to what the outcome would be. I can't even get a handle on how to go about testing it. Given that your intention is to use this to try and debug other threads/processes/shared memory, I don't see how you would ever be able to arrive at a confidence level that would allow you to decide that your debug module wasn't exacerbating whatever problems already exist in the programs you are trying to debug.

        You say that the plstrace program doesn't use any locking, you mention a ring buffer, and you say that "a little garbage in the stew won't hurt anything", but I don't see how you can draw any conclusions from what you see in snapshots taken of several asynchronous processes/threads without knowing the order in which the memory you are looking at came to hold the values you see. Actually, I'm not even sure how you would decide what was garbage and what was stew?

