Dirk80 has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I'm using threads and a Tk GUI. I know that I have to be careful with this because Tk is not thread-safe.

So I created a thread before creating the Tk GUI. My thread has an idle and a work state, and it starts in the idle state. When the generate button in the Tk GUI is pressed, the state (a shared variable) is changed to work, a data structure is passed to the thread via a queue, and the thread does its work. Afterwards it changes back to the idle state.

Everything is working fine apart from the performance.

Here is the data that is filled in by my Tk GUI:

    my %gen_data = (
        lri => {
            major_version              => 1,
            minor_version              => 10,
            is_integration_flag_active => FALSE,
            module_id                  => 0x01,
            output_file                => "D:/out.sre"
        },
        pm => {
            major_version => 4,
            minor_version => 2,
            build_version => 82,
            input_file    => "D:/in.s3"
        },
        gp => {
            is_active           => TRUE,
            major_version       => 3,
            minor_version       => 4,
            build_version       => 7341,
            nb_bytes_data_field => 16,
            root_path           => "D:/temp_path",
            tranche             => T2
        }
    );

The main task, in which Tk is running, then does the enqueuing. $ref_gen_data is a reference to the data structure filled via the Tk GUI.

    # lri
    $q->enqueue($ref_gen_data->{'lri'}{'major_version'},
                $ref_gen_data->{'lri'}{'minor_version'},
                $ref_gen_data->{'lri'}{'is_integration_flag_active'},
                $ref_gen_data->{'lri'}{'module_id'},
                $ref_gen_data->{'lri'}{'output_file'});

    # pm
    $q->enqueue($ref_gen_data->{'pm'}{'major_version'},
                $ref_gen_data->{'pm'}{'minor_version'},
                $ref_gen_data->{'pm'}{'build_version'},
                $ref_gen_data->{'pm'}{'input_file'});

    # gp
    $q->enqueue($ref_gen_data->{'gp'}{'is_active'},
                $ref_gen_data->{'gp'}{'major_version'},
                $ref_gen_data->{'gp'}{'minor_version'},
                $ref_gen_data->{'gp'}{'build_version'},
                $ref_gen_data->{'gp'}{'nb_bytes_data_field'},
                $ref_gen_data->{'gp'}{'root_path'},
                $ref_gen_data->{'gp'}{'tranche'});

The worker thread does the dequeuing to get the data.

    my %gen_data;

    # lri
    $gen_data{'lri'}{'major_version'}              = $q->dequeue();
    $gen_data{'lri'}{'minor_version'}              = $q->dequeue();
    $gen_data{'lri'}{'is_integration_flag_active'} = $q->dequeue();
    $gen_data{'lri'}{'module_id'}                  = $q->dequeue();
    $gen_data{'lri'}{'output_file'}                = $q->dequeue();

    # pm
    $gen_data{'pm'}{'major_version'} = $q->dequeue();
    $gen_data{'pm'}{'minor_version'} = $q->dequeue();
    $gen_data{'pm'}{'build_version'} = $q->dequeue();
    $gen_data{'pm'}{'input_file'}    = $q->dequeue();

    # gp
    $gen_data{'gp'}{'is_active'}           = $q->dequeue();
    $gen_data{'gp'}{'major_version'}       = $q->dequeue();
    $gen_data{'gp'}{'minor_version'}       = $q->dequeue();
    $gen_data{'gp'}{'build_version'}       = $q->dequeue();
    $gen_data{'gp'}{'nb_bytes_data_field'} = $q->dequeue();
    $gen_data{'gp'}{'root_path'}           = $q->dequeue();
    $gen_data{'gp'}{'tranche'}             = $q->dequeue();

Now the summary: if I do it like this, the thread is very slow. The algorithm needs about 1 hour.

I found out that the problem is the major and minor version of the gp. If I use the following code for enqueuing (replacing the major and minor version of gp with sample values), the thread works very fast (about 30s).

    # lri
    $q->enqueue($ref_gen_data->{'lri'}{'major_version'},
                $ref_gen_data->{'lri'}{'minor_version'},
                $ref_gen_data->{'lri'}{'is_integration_flag_active'},
                $ref_gen_data->{'lri'}{'module_id'},
                $ref_gen_data->{'lri'}{'output_file'});

    # pm
    $q->enqueue($ref_gen_data->{'pm'}{'major_version'},
                $ref_gen_data->{'pm'}{'minor_version'},
                $ref_gen_data->{'pm'}{'build_version'},
                $ref_gen_data->{'pm'}{'input_file'});

    # gp
    $q->enqueue($ref_gen_data->{'gp'}{'is_active'},
                3,
                4,
                $ref_gen_data->{'gp'}{'build_version'},
                $ref_gen_data->{'gp'}{'nb_bytes_data_field'},
                $ref_gen_data->{'gp'}{'root_path'},
                $ref_gen_data->{'gp'}{'tranche'});

It is so strange. Why is the thread so slow if I use the major and minor version of gp, and why is it fast if I put in sample values instead? I do not understand why. I hope that you can give me some hints.

I also tried to use shared variables instead of the queue, but the behaviour was exactly the same.

The Perl script is running on Windows XP.

Thank you very much in advance.

Dirk

Re: Thread very slow
by Corion (Patriarch) on Jul 12, 2010 at 09:17 UTC

    Maybe Tk tied the variables in your GUI, so that they automatically update whenever they get changed from the GUI? And it seems that passing these variables into the thread queue did not dissociate them, which seems weird to me.

    Personally, I would only pass values through queues, that is, copy the values out. Also, I wouldn't pass multiple different values to the thread one by one, but pass in all values for one job in one go:

        use Data::Dumper;

        my $job = {};
        for my $field (qw( lri pm gp )) {
            # copy the values of each sub-hash into a fresh anonymous hash
            $job->{ $field } = { %{ $gen_data{ $field } } };
            warn "Processing " . Dumper $job;
            sleep 10;
        };
        warn "Enqueuing " . Dumper $job;
        $q->enqueue( $job );
        $q->enqueue( undef ); # tell the thread that we're done and it should quit

        ...

        # In the thread
        while (my $job = $q->dequeue) {
        };

      Thanks. Your advice with Data::Dumper helped me a lot. Your code version was not the final solution because the thread was still slow, but it helped me to find the solution. Now the thread is running fast.

      Here is the code of the main task:

          $Data::Dumper::Varname = "gen_data_str";
          $gen_data_str1 = Dumper($ref_gen_data);
          $q->enqueue( $gen_data_str1 );
          $q->enqueue( undef );

      Here is the code within the worker thread:

          my %gen_data;
          while( my $gen_data_str1 = $q->dequeue ) {
              %gen_data = %{ eval $gen_data_str1 };
          };
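A minimal sketch of the Data::Dumper round trip described above, without Tk or threads, so it only illustrates the serialization step and not the speed-up. The sample values and the use of `$Data::Dumper::Terse` (instead of the `Varname` setting in the post) are assumptions made for illustration. The point is that the queue would carry a plain string, and the structure rebuilt from it is an independent copy:

```perl
use strict;
use warnings;
use Data::Dumper;

# Assumption for this sketch: emit a bare expression instead of "$VAR1 = ...",
# so that eval simply returns the rebuilt structure.
$Data::Dumper::Terse = 1;

# Hypothetical sample data, shaped like the gp part of %gen_data.
my $ref_gen_data = {
    gp => { major_version => 3, minor_version => 4, root_path => "D:/temp_path" },
};

# Sender side: turn the whole structure into one string.
my $serialized = Dumper($ref_gen_data);

# Receiver side: rebuild the structure from the string.
my $copy = eval $serialized;
die "could not rebuild data: $@" if $@;

# The rebuilt structure is independent of the original.
$copy->{gp}{major_version} = 99;
print "original: $ref_gen_data->{gp}{major_version}\n";
print "copy:     $copy->{gp}{major_version}\n";
```

Only eval strings that your own program produced; evaluating untrusted input this way would run arbitrary code.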

        I think there must still be something going wrong with your serialization, as Thread::Queue basically does what you do, except it's using the (slightly faster in most cases) Storable::nstore to serialize the data for the queue.

        As you haven't shown the code where you construct %gen_data and the values you construct it from, it's hard to guess. If you can construct an example without Tk that exhibits the behaviour, that would be convenient. If you can post (a small excerpt of) how you populate %gen_data, that might also give more clues as to what is really going wrong.

Re: Thread very slow
by BrowserUk (Patriarch) on Jul 12, 2010 at 10:40 UTC

    Sorry. But until I see it (and reproduce it), I don't believe that your diagnosis of the problem is correct.

    Whichever of the "two ways" you show to enqueue your data, you are actually doing exactly the same thing. In both cases, you are simply passing a list of values to enqueue().

    Regardless of whether you extract those values from within a hash, or pass them as constants, what will end up on the queue, and therefore be received by the thread, is identical. And so this cannot be the reason for any significant change in performance. It simply cannot be.

    I'm not suggesting that you aren't seeing a performance difference, nor even that the change you document isn't somehow triggering it; but it cannot directly be the cause.

    I can only suggest that you either post the full code here, or if it is too big (>64k I think), stick it on a pastebin somewhere and link to it. Or you can send it to me by email.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Thread very slow
by BrowserUk (Patriarch) on Jul 17, 2010 at 00:36 UTC

    Corion called it right!

    The simplest fix is to modify:

        $ref_gen_data->{'gp'}{'major_version'},
        $ref_gen_data->{'gp'}{'minor_version'},

    to be:

        0 + $ref_gen_data->{'gp'}{'major_version'},
        0 + $ref_gen_data->{'gp'}{'minor_version'},

    That change will immediately reduce the runtime (on my system) from 27 minutes to 2 seconds.

    The source of the problem is that those two elements of the hash are referenced in '-textvariable' clauses in Entry() elements within your GUI. This causes Tk to apply tie magic to both variables, which effectively means that every reference to these variables (even a read) becomes a chain of function calls to resolve and apply the magic. As the "holder" of the magic is running on a different thread, it also involves multiple context switches. And hence, r-e-a-l-l-y s-l-o-w.

    The surprising thing here is that the attached magic survives the transmission of the values of the hash elements between threads via a Thread::Queue. This appears to be by (re)design, when share magic was modified to allow the sharing of (only specially written) objects between threads.

    The reason that adding 0 to the values prior to enqueuing them works is that it forces the value into the IV slot rather than the PV slot. And it appears the magic is only applied to the PV slot.
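The cost of reading a tied variable can be made visible without Tk or threads. The sketch below is only an illustration: `CountingScalar` is a hypothetical stand-in for whatever Tk attaches through -textvariable, and it does not reproduce the cross-thread overhead described above. It counts how often the tie's FETCH method runs when the hash element is read directly, and how often it runs when a copy made with `0 + ...` is read instead:

```perl
use strict;
use warnings;

# Hypothetical stand-in for Tk's -textvariable magic:
# a tied scalar that counts how often it is read.
package CountingScalar;

sub TIESCALAR {
    my ($class, $value) = @_;
    return bless { value => $value, fetches => 0 }, $class;
}
sub FETCH { my ($self) = @_; $self->{fetches}++; return $self->{value} }
sub STORE { my ($self, $value) = @_; $self->{value} = $value }

package main;

my %gp;
my $magic = tie $gp{major_version}, 'CountingScalar', 3;

# Read the tied hash element directly in a loop: each read goes through FETCH.
my $sum_tied = 0;
$sum_tied += $gp{major_version} for 1 .. 1000;
my $fetches_tied = $magic->{fetches};

# Copy the value out once with "0 + ...", then read only the copy.
my $plain     = 0 + $gp{major_version};
my $sum_plain = 0;
$sum_plain += $plain for 1 .. 1000;
my $fetches_plain = $magic->{fetches} - $fetches_tied;

print "FETCH calls while reading the tied element: $fetches_tied\n";
print "FETCH calls while reading the copy:         $fetches_plain\n";
```

Reading the copy no longer touches the tie at all; the only FETCH in the second half comes from making the copy. In the real program each of those FETCH calls would additionally have to cross into the thread that owns the Tk widgets, which is where the minutes went.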

