in reply to Thread sharing a bit vector??

Any insight into how I can share a Bit::Vector object, if at all? The threads::shared doc discusses some support for object sharing, but if anyone knows how to translate that into my simple example above, I would *really* appreciate it!

Dunno! I've never been able to work that out either. I think it might only work if Bit::Vector was modified to use the version of bless exported by threads::shared, but I suggest you contact the maintainer to get the real skinny on it.

However, I don't recommend sharing objects between threads anyway. Unless the module is written from the ground up with threading in mind, it just leads to a world of hurt. In the case of bit vectors it seems like a particularly bad idea.

If you could briefly describe your usage for Bit::Vector with threading, I may be able to help you. I have a simple XS bit vector module that bypasses threads::shared to allow multi-threaded use without the overheads and restrictions that module brings.

If your needs are simple, it might be possible for me to clean that up to the point where it is usable by others.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re^2: Thread sharing a bit vector??
by traceyfreitas (Sexton) on Apr 11, 2012 at 22:51 UTC

    Thanks for responding so fast!

    This is used in a rather *large* Perl program. Basically, I have thousands of arrays (currently ~3500) that could potentially contain millions of bit vectors each. Each bit vector size will be equal, likely 50 bits (at minimum) or 200 bits (at largest). I'll use the smallest in this example:

    # Create the bit vectors
    my $bv1 = Bit::Vector->new(50);
    my $bv2 = Bit::Vector->new(50);
    ...
    # Keep associated bv's together
    my @source1 :shared = ($bv1a,$bv2a,...,$bv85000a,...$bv25000000a);
    my @source2 :shared = ($bv1b,$bv2b,...,$bv85000b);
    ...
    # Add these arrays to a single, global array
    my @sources :shared = (\@source1, \@source2, ...);

    These may not all fit into memory, so I would have to use Storable's store() and retrieve() to dump them to disk and pull them back when needed. If they do fit into memory, I would like one shared array or hash that holds references to all these shared arrays of bit vectors, because I will be performing pairwise set intersections on all of them (yep, many-to-many). So I would want one copy of this hash/array in memory, shared amongst all threads, so each could pick what it needs, when it needs it, to do its subset of the intersections.
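For the store()/retrieve() round trip described above, here is a minimal sketch. It is not the poster's code: it assumes each bit vector is kept as a plain packed string built with Perl's built-in vec() rather than as a Bit::Vector object, and make_bits() and the temporary file are hypothetical names used only for illustration.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Hypothetical helper: a bit vector of $size bits, stored as a packed
# string, with the listed bit positions switched on.
sub make_bits {
    my ($size, @on) = @_;
    my $bits = "\0" x int(($size + 7) / 8);   # 50 bits -> 7 bytes
    vec($bits, $_, 1) = 1 for @on;
    return $bits;
}

# One "source" array of 50-bit vectors (only two here, for brevity).
my @source1 = (make_bits(50, 0, 3, 49), make_bits(50, 7));

# Dump the array to disk, then pull it back in.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
store(\@source1, $path);
my $restored = retrieve($path);

# Show the first restored vector, bit 0 first.
print unpack('b50', $restored->[0]), "\n";
```

Because the elements are ordinary strings, Storable needs no special support for them. Whether that is fast enough at millions of vectors per array is something only a benchmark on the real data can answer.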

    The RESULTS of these intersections would be bit vectors as well and after a thread has finished computing the intersections on its bunch of bit vectors, I would like to add these bit vectors to a different shared "results" hash of bit vectors that organizes the results of these source|target set comparisons, and move on from there.

    # One pairwise intersection:
    # $intrx1_2 is a bit vector whose bits correspond to the positions
    # in @source1 whose bit vectors intersected with those in @source2;
    my ($intrx1_2, $intrx2_1) = set_intersection(\@source1,\@source2);

    # Storing the results in a globally shared hash
    $globally_shared_hash{$source1}->{$target2} = $intrx1_2;
    $globally_shared_hash{$source2}->{$target1} = $intrx2_1;
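The post does not show set_intersection() itself, so the following is only a guess at its shape. It assumes that two vectors "intersect" when they share at least one set bit, and that vectors are packed strings rather than Bit::Vector objects. set_intersection_sketch() is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for set_intersection(). Returns two packed
# bit strings: bit $i of the first is set when $src->[$i] shares a set
# bit with any target vector; bit $j of the second likewise for $tgt.
sub set_intersection_sketch {
    my ($src, $tgt) = @_;
    my $hits_src = "\0" x int((@$src + 7) / 8);
    my $hits_tgt = "\0" x int((@$tgt + 7) / 8);
    for my $i (0 .. $#$src) {
        for my $j (0 .. $#$tgt) {
            # String bitwise AND; any non-NUL byte means a common bit.
            next unless ($src->[$i] & $tgt->[$j]) =~ /[^\0]/;
            vec($hits_src, $i, 1) = 1;
            vec($hits_tgt, $j, 1) = 1;
        }
    }
    return ($hits_src, $hits_tgt);
}

# Tiny usage example: each vector has exactly one bit set.
my @source1 = map { my $s = "\0" x 7; vec($s, $_, 1) = 1; $s } (1, 5, 9);
my @source2 = map { my $s = "\0" x 7; vec($s, $_, 1) = 1; $s } (5, 20);
my ($intrx1_2, $intrx2_1) = set_intersection_sketch(\@source1, \@source2);
print unpack('b3', $intrx1_2), " ", unpack('b2', $intrx2_1), "\n";
```

The nested loop is quadratic in the array sizes, so at millions of vectors per array it shows the bookkeeping only, not a workable strategy.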

    Downstream subs will process these bit vectors further, so rather than have each thread re-load its subset of bit vectors from disk, I'd prefer that they just read from a shared memory space and avoid the hit of Storable's retrieve().

    The resulting globally shared hash would look something like:

    %globally_shared_hash = (
        $source1 => {
            $target2 => $bv1_2,
            $target3 => $bv1_3,
            $target4 => $bv1_4,
            ...
        },
        $source2 => {
            $target1 => $bv2_1,
            $target3 => $bv2_3,
            $target4 => $bv2_4,
            ...
        },
        ...
    );

    Currently, I used the following methods from the Bit::Vector package:

    Bit::Vector->new()          # bv constructor
    $vec->to_Hex()              # bv -> HEX string
    $vec->to_Bin()              # bv -> BINARY string
    $vec->Clone()               # new vector, exact duplicate
    $vec->Size()                # gets length of bv
    $vec->Reverse()             # reverses bv
    $vec->bit_test($index)      # 0 or 1
    $vec->bit_flip($index)      # flips bv's bit at $index
    $vec->Bit_On($index)        # turn bit on
    $vec->Bit_Off($index)       # turn bit off
    $vec->Interval_Scan_dec     # grabs (min,max) of next chunk of 0's
    $vec->Lexicompare($vec2)    # +1,0, or -1
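Several of the methods listed above have rough counterparts in Perl's built-in vec() and unpack() when a bit vector is held as a plain packed string. The sketch below assumes that substitution is acceptable for the task; it is not a drop-in replacement for Bit::Vector, and the string layouts differ from what to_Bin() and to_Hex() return.

```perl
use strict;
use warnings;

my $size = 50;
my $vec  = "\0" x int(($size + 7) / 8);    # roughly Bit::Vector->new(50)

vec($vec, 3, 1) = 1;                       # like Bit_On(3)
vec($vec, 3, 1) = 0;                       # like Bit_Off(3)
vec($vec, 10, 1) = 1 - vec($vec, 10, 1);   # like bit_flip(10)
my $is_set = vec($vec, 10, 1);             # like bit_test(10)
my $copy   = $vec;                         # like Clone(): a plain string copy

# Binary string with bit 0 first; Bit::Vector's to_Bin() is, as far as
# I know, written with the highest bit first, so the two are mirrored.
my $bin = unpack("b$size", $vec);

# Byte-wise hex dump; again not the same layout as to_Hex().
my $hex = unpack('H*', $vec);

print "$bin\n$hex\n";
```

Plain strings like these can be placed in `:shared` arrays, which XS-backed objects cannot, but that does nothing about the per-element overhead of threads::shared.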

    Think your XS could alleviate my problem or do you think I just need to be more creative with how I manage thread-local data?

      Think your XS could alleviate my problem

      As it currently exists, it does not support the full range of operations you require.

      It could be extended, but doing so in a portable manner would take considerable effort, as it currently relies on Windows-specific memory management and MS compiler intrinsic semantics.

      or do you think I just need to be more creative with how I manage thread-local data?

      I'm sorry to say that I don't believe that threads::shared is currently capable of doing what you need it to do. Because of that module's internal implementation, sharing large volumes of data across threads is not currently a viable option.

      The only viable solution to your description that I am aware of at this time would be to use a PostgreSQL DB and its BitString types & operators.

