Re: Camel vs. Gopher
by kschwab (Vicar) on Dec 08, 2018 at 21:20 UTC
|
Well, if you need it fast, this seems faster than golang:
$ # note: run it once before timing it; Inline::C caches the compiled code
$ time perl x.pl
0 999837
1 999643
2 998992
3 999381
4 1000629
5 999501
6 999830
7 1001287
8 1001751
9 999149
real 0m0.129s
user 0m0.124s
sys 0m0.004s
I'm "cheating" though:
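#!/usr/bin/perl
use Inline C;
use strict;
use warnings;

# doit() returns a reference to a hash mapping each digit to its count.
my $count = doit(10_000_000);
for my $int ( sort keys %{$count} ) {
    printf "%s\t%s\n", $int, $count->{$int};
}
__END__
__C__
#include <stdio.h>   /* sprintf */
#include <stdlib.h>  /* rand, srand */
#include <time.h>    /* time */

SV* doit(int howmany) {
    HV* hv = newHV();
    unsigned int count[10] = {0};
    char key[2];   /* one digit plus the terminating NUL */
    int i;

    srand((unsigned) time(NULL));
    /* Tally in a plain C array; the Perl hash is only built at the end. */
    for (i = 0; i < howmany; i++) {
        count[rand() % 10]++;
    }
    for (i = 0; i < 10; i++) {
        sprintf(key, "%d", i);
        hv_store(hv, key, 1, newSVuv(count[i]), 0);
    }
    return newRV_noinc((SV *)hv);
}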
Re: Camel vs. Gopher
by Your Mother (Archbishop) on Dec 08, 2018 at 19:57 UTC
|
Another Perl take, pretty idiomatic; seems to be a fair bit faster.
@ten = ( 0 .. 9 );
$count{$ten[rand@ten]}++ for 1..10_000_000;
print "$_ -> $count{$_}\n" for @ten;
|
use strict;
use warnings;
use Time::HiRes 'time';
my $start = time;
my @ten = ( 0 .. 9 );
my %count;
$count{ $ten[ rand @ten ] }++ for 1..10_000_000;
print "$_ -> $count{$_}\n" for @ten;
printf "duration: %0.3f seconds\n", time - $start;
Parallel Implementation
Disclaimer: I typically use the OO interface for shared variables. The reason is that exclusive locking is handled automatically.
use strict;
use warnings;
use MCE;
use MCE::Shared;
use Time::HiRes 'time';
my $start = time;
my @ten = ( 0 .. 9 );
my $count = MCE::Shared->hash();
MCE->new(
    max_workers => 4,
    sequence    => [ 1, 10_000_000 ],
    bounds_only => 1,
    chunk_size  => 50_000,
    user_func   => sub {
        my ( $mce, $seq, $chunk_id ) = @_;
        my %lcount;
        # compute using a local hash - involves zero IPC
        $lcount{ $ten[ rand @ten ] }++ for $seq->[0] .. $seq->[1];
        # increment shared hash - one IPC per key
        while ( my ( $key, $val ) = each %lcount ) {
            $count->incrby( $key, $val );
        }
    }
)->run;
printf "$_ -> %ld\n", $count->get($_) for @ten;
printf "duration: %0.3f seconds\n", time - $start;
Results running Perl v5.28; the parallel time includes worker spawning and shutdown:
Serial 1.534 seconds
Parallel 0.410 seconds
0 -> 999455
1 -> 1000312
2 -> 999828
3 -> 1001949
4 -> 999227
5 -> 997375
6 -> 1001048
7 -> 999806
8 -> 1000212
9 -> 1000788
Regards, Mario
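For readers without MCE installed, the same idea (each worker tallies into a private hash, and the results are merged once at the end) can be sketched with core fork and pipes. This is only an illustration of the pattern, not how MCE or MCE::Shared work internally; count_chunk and parallel_count are names made up for this sketch, and it assumes a platform with a real fork.

```perl
use strict;
use warnings;

# Hypothetical helper: tally $n random digits into a private hash.
sub count_chunk {
    my ($n) = @_;
    my %local;
    $local{ int rand 10 }++ for 1 .. $n;
    return \%local;
}

# Hypothetical helper: fork $workers children, let each count its share,
# then merge the per-worker tallies in the parent.
sub parallel_count {
    my ( $total, $workers ) = @_;
    my $share = int( $total / $workers );
    my @readers;

    for my $w ( 1 .. $workers ) {
        # The last worker also picks up the remainder.
        my $n = $w == $workers ? $total - $share * ( $workers - 1 ) : $share;

        pipe( my $reader, my $writer ) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;

        if ( $pid == 0 ) {    # child
            close $reader;
            srand();          # reseed so children do not share a sequence
            my $local = count_chunk($n);
            # One short line per key is all that crosses the pipe.
            print {$writer} "$_ $local->{$_}\n" for keys %$local;
            close $writer;
            exit 0;
        }

        close $writer;        # parent keeps only the read end
        push @readers, $reader;
    }

    my %count;
    for my $reader (@readers) {
        while ( my $line = <$reader> ) {
            my ( $key, $val ) = split ' ', $line;
            $count{$key} += $val;
        }
        close $reader;
    }
    1 while wait() != -1;     # reap the children
    return \%count;
}

my $count = parallel_count( 1_000_000, 4 );
print "$_ -> $count->{$_}\n" for sort { $a <=> $b } keys %$count;
```

Each child sends at most ten short lines, so the amount of inter-process traffic does not depend on how many numbers were generated.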
|
# increment shared hash - single IPC per chunk
$count->pipeline(
map { [ 'incrby', $_ => $lcount{$_} ] } keys %lcount
);
Comparison with chunk_size => 1_000:
Before 0.814 seconds $count->incrby(...)
After 0.522 seconds $count->pipeline(...)
Regards, Mario
Re: Camel vs. Gopher
by vr (Curate) on Dec 09, 2018 at 12:45 UTC
|
Wait, did you say "short and fast and Perl"? One-liner, then?
$ time perl -MPDL -e 'print transpose pdl long hist +(10*random 1e7),0,10,1'
[
[ 0 998557]
[ 1 998651]
[ 2 1000878]
[ 3 1001181]
[ 4 1000788]
[ 5 1000577]
[ 6 1000108]
[ 7 999979]
[ 8 997430]
[ 9 1001851]
]
real 0m0.290s
user 0m0.215s
sys 0m0.074s
Well, it is twice as slow as kschwab's code, but isn't he a cheater? You asked to "generate 10 million random integers", and only then "count the occurrence of each". If you ask where the integers are above: the bin centers effectively "floor" the random numbers in each bin to the bin's lower limit.
P.S. Generating explicit integers as
$ time perl -MPDL -e 'print transpose pdl long hist +(byte 10*random 1e7),0,10,1'
is only very slightly longer (in both length and time).
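The binning remark can be checked in plain Perl without PDL: counting floats from [0, 10) in unit-width bins should give the same tallies as truncating each float to an integer first. The sketch below assumes a hypothetical helper, hist_counts, that simply skips out-of-range values; that is a simplification of mine and may differ from how PDL's hist treats them.

```perl
use strict;
use warnings;

# Hypothetical helper: histogram with bins
# [min + k*step, min + (k+1)*step) for k = 0 .. nbins-1.
# Values outside the range are skipped (an assumption of this sketch).
sub hist_counts {
    my ( $values, $min, $step, $nbins ) = @_;
    my @bins = (0) x $nbins;
    for my $v (@$values) {
        next if $v < $min;
        my $k = int( ( $v - $min ) / $step );
        next if $k >= $nbins;
        $bins[$k]++;
    }
    return \@bins;
}

my @floats = map { 10 * rand() } 1 .. 100_000;

# Count by bin, never creating explicit integers.
my $by_bin = hist_counts( \@floats, 0, 1, 10 );

# Count by truncating each float to an integer first.
my @by_int = (0) x 10;
$by_int[ int $_ ]++ for @floats;

# The two count columns should agree row by row.
printf "%d  %7d  %7d\n", $_, $by_bin->[$_], $by_int[$_] for 0 .. 9;
```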
Re: Camel vs. Gopher
by morgon (Priest) on Dec 12, 2018 at 09:55 UTC
|
I like both languages
Me too.
Concurrency is finally easy and fun.
Being faster than Perl is only a bonus.
|
Being faster than Perl is only a bonus.
Hey now. My direct, pure Perl answer was four times faster than the original Perl presented and not even twice as slow as the Go version; 70% slower on my box.
vr's answer, with a well-known Perl module, is two times faster than the Go version, and kschwab's answer is ten times faster than the Go version and still shorter by several lines and a few dozen characters. Being faster is a matter of tuning and knowing the problem space. With the evidence presented, Perl is the clear speed, space, and cognitive load winner if you know how to match it with the problem.
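For anyone who wants to try this kind of tuning on their own box, the core Benchmark module makes it cheap to compare variants. The sketch below pits a hash-based counter against an array-based one; count_hash and count_array are names invented here, and no claim is made about which wins or by how much on any given machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical variant 1: tally digits in a hash.
sub count_hash {
    my ($n) = @_;
    my %c;
    $c{ int rand 10 }++ for 1 .. $n;
    return \%c;
}

# Hypothetical variant 2: tally digits in an array
# (a fractional index is truncated to an integer).
sub count_array {
    my ($n) = @_;
    my @c = (0) x 10;
    $c[ rand 10 ]++ for 1 .. $n;
    return \@c;
}

# A negative count asks Benchmark to run each variant
# for at least that many CPU seconds.
cmpthese( -1, {
    hash  => sub { count_hash(100_000) },
    array => sub { count_array(100_000) },
} );
```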
|
Sure :-)
In this particular case, where you don't even think about doing part of the work concurrently, a Perl solution can maybe be optimized to beat even Go.
But I can see that you're not being too serious...
For me Go is the new Perl: a language with a rich ecosystem that is fun to use and that makes things easy that were very difficult before...
|
That may be so.
I see that you need more convincing, so here is another killer feature of Go: cross-compiling statically linked binaries.
On my Linux system I can build binaries for Mac (or Windows) that do not have any dependencies and will simply work when I copy ONE file to another machine running a different OS.
Do that with Perl.