in reply to Sharing large data structures between threads

Okay then, if the number of threads is excluded as a matter of discussion, what about the roles of these various threads?   Are they all simply clones of one another?   Would it be possible, for example, to delegate one thread that actually owns the data-structure, and that provides information about it to the other processes on-request?
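
To make the "one owner thread" idea concrete, here is a minimal sketch using the core threads and Thread::Queue modules. Everything specific in it is my own assumption, not something from your application: the %big structure, the keys, the worker count and the lookup() helper are all made up for illustration. Only the owner thread ever holds the data; the others send it a request and block on their own reply queue.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $WORKERS  = 3;                                        # illustrative count
my $requests = Thread::Queue->new;                       # workers -> owner
my @replies  = map { Thread::Queue->new } 1 .. $WORKERS; # owner -> worker N

# The owner thread is the only place the (hypothetical) structure lives.
my $owner = threads->create(sub {
    my %big = (
        alpha => { size => 10 },
        beta  => { size => 20 },
        gamma => { size => 30 },
    );
    # Serve requests until an undef stop marker arrives.
    while (defined(my $req = $requests->dequeue)) {
        my ($id, $key) = @$req;
        my $answer = exists $big{$key} ? $big{$key}{size} : -1;
        $replies[$id]->enqueue($answer);
    }
});

# Hypothetical helper: ask the owner for one value and wait for the answer.
sub lookup {
    my ($id, $key) = @_;
    $requests->enqueue([$id, $key]);
    return $replies[$id]->dequeue;
}

my @keys = qw(alpha beta gamma);
my @workers;
for my $id (0 .. $WORKERS - 1) {
    push @workers, threads->create(sub { lookup($id, $keys[$id]) });
}

my @sizes = map { $_->join } @workers;
$requests->enqueue(undef);    # tell the owner to stop
$owner->join;
print "@sizes\n";
```

The obvious price is one queue round trip per lookup, so this only pays off if the other threads ask for small answers and do not ask too often.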

I really get nervous about “highly pervasive” code changes, like rewriting the code to use one-dimensional hashes and so forth.   Even though they might “solve” the problem, they tend to introduce a lot of instability into the code.   Per contra, the “arkane magick” of accessing the data-structure could perhaps be spun off into a single object, which encapsulates all of the necessary incantations for actually getting to the thing (or perhaps to a relevant slice of it).   That would allow you to utter some pretty darned outrageous “spells,” and since you are only uttering them in one place, to still keep the code maintainable.
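
As a rough sketch of what I mean by spinning the access off into a single object: the class name DataStore, its get() and slice() methods, and the sample data are all hypothetical. The point is only that the rest of the code asks the object for what it wants, and never walks the structure itself.

```perl
package DataStore;    # hypothetical class name
use strict;
use warnings;

sub new {
    my ($class, $data) = @_;
    return bless { data => $data }, $class;
}

# Walk a path of keys. Returns nothing if any level is missing,
# and never autovivifies intermediate hashes along the way.
sub get {
    my ($self, @path) = @_;
    my $node = $self->{data};
    for my $key (@path) {
        return unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return $node;
}

# Hand back a shallow copy of one sub-hash, so callers cannot
# modify the stored hash through the returned reference.
sub slice {
    my ($self, @path) = @_;
    my $node = $self->get(@path);
    return ref($node) eq 'HASH' ? { %$node } : undef;
}

package main;

my $store = DataStore->new({
    host => { 'example.org' => { hits => 3, errors => 1 } },
});

print $store->get('host', 'example.org', 'hits'), "\n";
my $copy = $store->slice('host', 'example.org');
print join(',', sort keys %$copy), "\n";
```

If the storage later changes shape (flattened keys, a shared structure, whatever), only the inside of get() and slice() has to know about it.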

The notion of using tie is a good one, because the tie mechanism is pretty darned magickal already.   I would definitely explore that idea further.
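
For instance, and this is only a sketch under my own assumptions, a tied hash could let the calling code keep its ordinary hash syntax while the storage underneath is one flat hash with composite keys. The FlatSlice class, the separator character and the key layout are invented for the example; I am guessing at what a flat layout might look like.

```perl
package FlatSlice;    # hypothetical class name
use strict;
use warnings;

my $SEP = "\x1C";     # assumed separator between outer and inner key

sub TIEHASH {
    my ($class, $flat, $prefix) = @_;
    return bless { flat => $flat, prefix => $prefix, iter => [] }, $class;
}

# Build the composite key used in the flat hash.
sub _key { my ($self, $k) = @_; return $self->{prefix} . $SEP . $k }

sub FETCH  { my ($self, $k) = @_;     return $self->{flat}{ $self->_key($k) } }
sub STORE  { my ($self, $k, $v) = @_; $self->{flat}{ $self->_key($k) } = $v; return }
sub EXISTS { my ($self, $k) = @_;     return exists $self->{flat}{ $self->_key($k) } }
sub DELETE { my ($self, $k) = @_;     return delete $self->{flat}{ $self->_key($k) } }

# Iterate only over the flat keys that belong to this prefix.
sub FIRSTKEY {
    my ($self) = @_;
    my $p = $self->{prefix} . $SEP;
    $self->{iter} = [
        map  { substr($_, length $p) }
        grep { index($_, $p) == 0 }
        sort keys %{ $self->{flat} }
    ];
    return shift @{ $self->{iter} };
}
sub NEXTKEY { my ($self) = @_; return shift @{ $self->{iter} } }

package main;

my %flat;                                         # the one-dimensional storage
tie my %site, 'FlatSlice', \%flat, 'example.org'; # a hash-shaped view of one slice

$site{hits}   = 3;
$site{errors} = 1;

print join(',', sort keys %site), "\n";
print scalar(keys %flat), " flat entries\n";
```

The callers still write $site{hits}, and the incantation for building the composite key lives in exactly one place.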

Re^2: Sharing large data structures between threads
by Anonymous Monk on Mar 09, 2011 at 23:39 UTC
    The Windows memory-manager is known to be quite “lazy” and sometimes this characteristic produces memory-consumption that is considerably larger than you would intuitively expect.

    My original problems started with "Out of memory" errors caused by Perl having eaten its 32-bit address space... Yes, I did set the thread stack_size ages ago.

    Would it be possible, for example, to delegate one thread that actually owns the data-structure, and that provides information about it to the other processes on-request?

    Yes, I tried this. The results were not as good as I expected. See a few posts up.

    I have now been working on using HTTP::Async to reduce the number of threads. After several fixes to HTTP::Async itself, I am starting to see improvement. While I am pretty happy with the results, I do regret the additional layer of complexity I had to incorporate into my script.

    To BrowserUk: I fully appreciate your comments, but I don't think it would be fruitful to discuss the application design here. For starters, if I were starting from the ground up now, I would most certainly do many things differently. :-)

      Strictly speaking, by “lazy” I simply meant that it does not hurry to reclaim memory until it is truly pressured to do so.

      “It is,” as a friend of mine would once diplomatically put it, “suboptimal ...”
      (I think he was secretly a Vulcan.)

      Nevertheless, and (of course) knowing nothing at all about the particulars of this application, I do think that you are compelled to find a practical way to “throttle” the maximum number of threads that exist, and to use some kind of a queue to absorb the moment-to-moment fluctuations in the incoming workload.   Not an ideal situation, perhaps, but I see little available choice in the matter.
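
      The throttle-plus-queue arrangement could be sketched roughly like this, with the core threads and Thread::Queue modules. The cap of four workers, the handle_job() stand-in and the integer "jobs" are purely my assumptions; the real work would go where handle_job() is.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $MAX_WORKERS = 4;                  # illustrative cap on thread count
my $jobs        = Thread::Queue->new; # absorbs bursts of incoming work
my $results     = Thread::Queue->new;

# Hypothetical stand-in for the real per-job work.
sub handle_job { my ($n) = @_; return $n * $n }

# Each worker pulls jobs until it sees an undef stop marker.
sub worker {
    while (defined(my $job = $jobs->dequeue)) {
        $results->enqueue(handle_job($job));
    }
    return;
}

# A fixed pool: the number of threads never exceeds $MAX_WORKERS,
# no matter how many jobs are waiting in the queue.
my @pool;
push @pool, threads->create(\&worker) for 1 .. $MAX_WORKERS;

$jobs->enqueue($_) for 1 .. 20;       # the "incoming workload"
$jobs->enqueue(undef) for @pool;      # one stop marker per worker
$_->join for @pool;

# All workers have finished, so a non-blocking drain is safe here.
my $total = 0;
while (defined(my $r = $results->dequeue_nb)) {
    $total += $r;
}
print "$total\n";
```

      The queue takes up the slack when work arrives faster than it can be done, and the thread count (and therefore the per-thread memory) stays fixed.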