Re: Wanting some clarification / opinions on MCE vs Threads
by BrowserUk (Patriarch) on Feb 05, 2015 at 07:53 UTC
use MCE::Grep;
my @a = mce_grep { $_ % 5 == 0 } 1..10000;
versus
use threads;
use Thread::Queue;
my @thread_pool;
my $q = Thread::Queue->new();
my $results = Thread::Queue->new();
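for (1..10000)    # same range as the mce_grep example above
{
    $q->enqueue($_);
}
$q->enqueue(undef);    # end-of-work marker; each worker passes it on
for (0..1)
{
    push @thread_pool, threads->create( \&grep );
}
sub grep
{
    # test with defined() so that a false item such as 0 does not end the loop
    while ( defined( my $work = $q->dequeue() ) )
    {
        if ( $work % 5 == 0 )
        {
            $results->enqueue($work);
        }
    }
    $q->enqueue(undef);    # hand the marker to the next worker
}
$_->join() for @thread_pool;
$results->enqueue(undef);
my @results;
while ( defined( my $result = $results->dequeue() ) )
{
    print $result, "\n";
    push @results, $result;
}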
I'm sure that the threads version could be done much more easily than what I hacked together in 5 minutes. It could just be that the way I write threads code is poor. Regardless, I don't think there is a threads implementation as simple as the MCE version. Additionally, this is a special case where MCE has a built-in function that provides this functionality, but there are similar constructs for most of the simple cases. For what I do, there isn't much that can't be implemented using some mixture of MCE::Grep, MCE::Map and MCE::Loop, so I'm biased.
I should also note that I haven't written very much (no "production" code as it were) using MCE so I may not have encountered some of its limitations compared to threads.
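A side note on queue-draining loops of the form while (my $work = $q->dequeue()): the loop tests the dequeued value for truth, so a legitimate item such as 0 ends it just as the undef marker does. The following is a minimal sketch of the difference, assuming the undef end-marker convention used in this thread; the queue contents are made up for the illustration.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Illustrative data only: three items, one of them false, then the undef marker.
my $q1 = Thread::Queue->new( 3, 0, 7, undef );
my $q2 = Thread::Queue->new( 3, 0, 7, undef );

# Truthiness test: the loop ends as soon as the false item 0 is dequeued.
my @truthy;
while ( my $item = $q1->dequeue ) {
    push @truthy, $item;
}

# defined() test: only the undef marker ends the loop.
my @all;
while ( defined( my $item = $q2->dequeue ) ) {
    push @all, $item;
}

print "truthiness test saw: @truthy\n";    # 3
print "defined test saw:    @all\n";       # 3 0 7
```

With input such as 1..10000 the two forms behave the same; the difference only shows up once 0 or an empty string can appear in the queue.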
(I should say my threads equivalent)
Hm. No offence but, Ew! :)
Remember that MCE is a wrapper (actually a suite of wrappers, or is that sweet wrapper :) over the top of threads (and other things), providing syntactic sugar for simple operations.
You can easily do the same yourself. Say, write TGrep.pm:
package TGrep;
use strict;
use threads;
use Thread::Queue;
our $WORKERS = 4;
sub g_proxy {
my( $code, $Qin, $Qout ) = @_;
$Qout->enqueue( $code->() ? $_ : () ) while local $_ = $Qin->dequeue;
$Qout->enqueue( undef );
}
sub tgrep(&@) {
my $workers = $WORKERS;
my $code = shift;
my( $Qin, $Qout ) = map Thread::Queue->new, 1..2;
async( \&g_proxy, $code, $Qin, $Qout )->detach for 1 .. $workers;
$Qin->enqueue( map{ ref $_[0] ? @{ $_ } : $_ } @_ );
$Qin->enqueue( (undef) x $workers );
my @results;
push @results, $_ while $_ = $Qout->dequeue;
return wantarray ? @results : \@results;
}
sub import {
no strict 'refs';
my $pkg = caller;
*{ $pkg . '::' . $_ } = *{ $_ } for qw[ tgrep ];
}
1;
Then write:
#! perl -slw
use strict;
use TGrep;
use Time::HiRes qw[ time ];
our $N //= 1e3;
my $start = time;
my @a = tgrep{ $_ % 5 == 0 } [ 1..$N ];
printf "Took %.9f seconds\n", time() -$start;
print scalar @a;
__END__
C:\test>t-tgrep -N=1e5
Took 3.830074787 seconds
20000
Of course, you'd probably be better sticking to grep for such simple things:
$t=time;
my @a = grep{ $_%5 == 0 } 1 .. 1e6;
print time-$t;;
0.18144702911377
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
#!/usr/bin/perl --
use strict; use warnings;
use threads ;
use Thread::Queue;
Main( @ARGV );
exit( 0 );
sub threads_grep(&@) {
my $cb = shift;
my $max = int @_;
my $args = ref $_[0] ? shift : { slaves => 4, maxdq => $max * 1000 };
my $slaves = $args->{slaves} || 4;
my $maxdq = $args->{maxdq} || 1e9;
my $qin = Thread::Queue->new( @_, ( undef ) x $slaves );
my $qout = Thread::Queue->new();
my @kids = map {
threads->create(
sub { ## threads_grep_cb
my( $cb, $qin, $qout ) = @_;
local $_;
while( $_ = $qin->dequeue ) {
if( $cb->() ) {
$qout->enqueue( $_ );
}
}
warn 'tids ahoy ', threads->tid;
return;
},
$cb,
$qin,
$qout
);
} 1 .. $slaves;
$_->join for @kids;
$qin->end;
$qout->end;
return $qout->dequeue( $maxdq );
} ## end sub threads_grep(&@)
sub Main {
#~ my @res = threads_grep { $_ % 5 == 0 } 1..1000;
my @res = threads_grep { $_ % 5 == 0 } { slaves => 2 }, 1..1000;
print "@res\n";
} ## end sub Main
__END__
tids ahoy 1 at - line 26.
tids ahoy 2 at - line 26.
5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125 130 135 140 145 150 155 160 165 170 175
180 185 190 195 200 205 210 215 220 225 230 235 240 245 250 255 260 265 270 275 280 285 290 295 300 305 310 315 320 325
330 335 340 345 350 355 360 365 370 375 380 385 390 395 400 405 410 415 420 425 430 435 440 445 450 455 460 465 470 475
480 485 490 495 500 505 510 515 520 525 530 535 540 545 550 555 560 565 570 575 580 585 590 595 600 605 610 615 620 625
630 635 640 645 650 655 660 665 670 675 680 685 690 695 700 705 710 715 720 725 730 735 740 745 750 755 760 765 770 775
780 785 790 795 800 805 810 815 820 825 830 835 840 845 850 855 860 865 870 875 880 885 890 895 900 905 910 915 920 925
930 935 940 945 950 955 960 965 970 975 980 985 990 995 1000
Re: Wanting some clarification / opinions on MCE vs Threads
by marioroy (Prior) on Feb 12, 2015 at 07:12 UTC
MCE began life as a chunking engine with support for serialized output or actions, e.g. serializing log data to a single file without worrying about many workers writing simultaneously: MCE->print($LOG_FH, "$msg\n");
The native grep function will typically run faster for small code. Below, mce_grep has low overhead due to chunking input. Output order is also preserved (not shown).
# $N = 1e6;
TGrep......: Took 30.264018774 seconds (4 workers)
mce_grep...: Took 0.299300909 seconds (4 workers)
native grep: Took 0.106141806 seconds
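To illustrate why chunking lowers the overhead, here is a rough sketch using only threads and Thread::Queue. It is not how MCE is implemented; chunked_tgrep, its argument order and the chunk size of 1000 are made up for the illustration. Each queue operation moves a whole chunk rather than a single item, and every chunk carries an index so that the input order can be restored at the end.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical helper (not part of MCE or TGrep): a parallel grep that
# hands each worker a chunk of items instead of one item at a time.
sub chunked_tgrep {
    my ( $code, $workers, $chunk_size, @items ) = @_;

    my $Qin = Thread::Queue->new;

    # Enqueue [ chunk_index, [ items ... ] ] pairs, then one undef per worker.
    my $idx = 0;
    while ( my @chunk = splice @items, 0, $chunk_size ) {
        $Qin->enqueue( [ $idx++, \@chunk ] );
    }
    $Qin->enqueue( (undef) x $workers );

    my @threads = map {
        threads->create( { context => 'list' }, sub {
            my @out;
            while ( defined( my $job = $Qin->dequeue ) ) {
                my ( $i, $chunk ) = @$job;
                # Native grep runs over the whole chunk inside the worker.
                push @out, [ $i, [ grep { $code->() } @$chunk ] ];
            }
            return @out;    # handed back to the caller via join
        } );
    } 1 .. $workers;

    # Sort by chunk index so the output order matches the input order.
    my @parts = sort { $a->[0] <=> $b->[0] } map { $_->join } @threads;
    return map { @{ $_->[1] } } @parts;
}

my @a = chunked_tgrep( sub { $_ % 5 == 0 }, 4, 1000, 1 .. 100_000 );
print scalar(@a), "\n";
```

With 100,000 items and a chunk size of 1000, the workers perform about 100 dequeues in total instead of 100,000, which is where most of the saving comes from in this sketch.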
One reason for using MCE is wanting the freezing and thawing of data to be done automatically between the manager process and the workers, in either direction. Another likely reason is running MCE with AnyEvent or Mojo and benefiting from chunking; e.g. each worker receives 300 hosts or URLs at a time and processes the chunk with the desired event loop.
MCE::Queue is not necessary when threads are desired. One can still use Thread::Queue, unless one wants the priority queues that MCE::Queue makes possible. MCE::Queue also helps when Perl is not built with threads support (common on some platforms): both MCE::Queue and MCE::Mutex support threads and processes.
The next update will include Tutorial.pod demonstrating parallelism for various CPAN modules.
Re: Wanting some clarification / opinions on MCE vs Threads
by marioroy (Prior) on Feb 12, 2015 at 08:28 UTC
For the curious, here is a version of TGrep.pm by BrowserUk modified to use MCE. Simply remove or comment out "use threads" if Perl lacks support for threads. Threads is not necessary for MCE::Queue.
package TGrep;
use strict;
use threads;
use MCE;
use MCE::Queue;
our $WORKERS = 4;
sub tgrep(&@) {
my $workers = $WORKERS;
my $code = shift;
my @results;
my $Qin = MCE::Queue->new( fast => 1 );
my $Qout = MCE::Queue->new( queue => \@results );
my $mce = MCE->new(
max_workers => $workers, user_func => sub {
$Qout->enqueue( $code->() ? $_ : () ) while local $_ = $Qin->dequeue;
}
)->spawn;
$Qin->enqueue( map{ ref $_[0] ? @{ $_ } : $_ } @_ );
$Qin->enqueue( (undef) x $workers );
$mce->run;
return wantarray ? @results : \@results;
}
sub import {
no strict 'refs';
my $pkg = caller;
*{ $pkg . '::' . $_ } = *{ $_ } for qw[ tgrep ];
}
1;
Thus, new results emerge.
# $N = 1e6 (4 workers)
TGrep....: Took 30.264018774 seconds
TGrep MCE: Took 15.292614937 seconds (fast => 0)
TGrep MCE: Took 7.824651003 seconds (fast => 1)
I was curious about how MCE::Queue compares to Thread::Queue for the demonstration. The fast option is beneficial for an already populated queue.