I'd appreciate some advice on multi-threading. I've never done this with Perl before, but I have an application that screams for threads.

Here is the basic code that I want to convert to threading:

my %result;
foreach my $token (@tokens) {
    my $regex = build_regex($token);
    my @line_results = grep { $_ ne $token and /$regex/ } @tokens;
    $result{$token} = [@line_results];
}

@tokens is an array of about 80K tokens. For each token, I build a custom regex, which is then run against all of the other tokens. What I'm looking for are the other tokens that "sort of match" each token. All of the rules for what counts as "close enough" are built into the build_regex() function. That whole subject is complicated and beyond my question here.

Yes, this is brute force with N*N execution time. Right now, this takes about 90 minutes. I want this to run at least 3x faster. I have a 4-core machine, so that sounds feasible.

I have been looking at Re: Perl Threads Boss/Worker Example as an example of worker threads. I can see that I would build a queue of the @tokens plus 4 undef values for a 4-thread limit. What I don't understand is how the combined results can be generated without the threads "stepping upon each other". How can threads write to a global structure accessible to them all?
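
To make the question concrete, here is a minimal sketch of one possible answer, not necessarily what the linked boss/worker example does. It avoids a shared global structure altogether: each worker fills a private hash and returns it when joined, and the main thread merges the pieces. The build_regex() body and the six sample tokens are hypothetical stand-ins for the real ones, and the sketch assumes a perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for the real build_regex(): here "close enough"
# simply means "starts with the same first letter".
sub build_regex {
    my ($token) = @_;
    my $first = quotemeta(substr($token, 0, 1));
    return qr/^$first/;
}

# Sample data (assumption); the real list has about 80K tokens.
my @tokens    = qw(apple apricot banana blueberry cherry avocado);
my $n_workers = 4;

# Work queue: every token, followed by one undef per worker as a stop marker.
my $queue = Thread::Queue->new(@tokens, (undef) x $n_workers);

# Each worker collects its results in a private hash, so no locking is needed.
# @tokens is copied into each thread when the thread is created.
sub worker {
    my %partial;
    while (defined(my $token = $queue->dequeue)) {
        my $regex = build_regex($token);
        $partial{$token} = [ grep { $_ ne $token and /$regex/ } @tokens ];
    }
    return \%partial;    # copied back to the parent on join
}

# The map block gives create() list context, so join() returns a list.
my @workers = map { threads->create(\&worker) } 1 .. $n_workers;

# Only the main thread ever writes to %result.
my %result;
for my $thr (@workers) {
    my ($partial) = $thr->join;
    @result{ keys %$partial } = values %$partial;
}

for my $token (sort keys %result) {
    print "$token: @{ $result{$token} }\n";
}
```

Because each token is dequeued by exactly one worker, the partial hashes have disjoint keys and the merge cannot overwrite anything. The alternative would be a hash declared with threads::shared and guarded by lock(), which lets workers write directly but adds locking overhead. Whether either approach reaches a 3x speedup depends on how much time goes into copying data between threads, so it would need to be measured.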


In reply to Multi-thread combining the results together by Marshall
