PerlMonks  

Camel vs. Gopher

by reisinge (Hermit)
on Dec 08, 2018 at 19:16 UTC ( [id://1226977]=perlmeditation )

I've been using Perl for several years, mostly for small to medium-sized programs of the sysadmin type (automation, gluing, data transformation, log searching). Recently I started to learn Go. I wanted to write something in both languages and compare. Here goes.

The Perl code is more than 2 times smaller:

$ ls -l x.* | perl -lanE 'say "$F[8]\t$F[4] bytes"'
x.go    694 bytes
x.pl    294 bytes

Perl code is more than 4 times slower when run ...

$ time go run x.go > /dev/null

real    0m1.222s
user    0m1.097s
sys     0m0.220s

$ time perl x.pl > /dev/null

real    0m5.358s
user    0m4.778s
sys     0m0.497s

... and more than 5 times slower when I built the Go code:

$ go build x.go
$ time ./x > /dev/null

real    0m0.947s
user    0m0.890s
sys     0m0.126s

The code generates 10 million random integers from 0 to 9. Then it counts the occurrences of each generated integer and prints them.

$ cat x.go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	// Seed the random number generator
	seed := rand.NewSource(time.Now().UnixNano())
	r1 := rand.New(seed)

	// Generate random integers
	var ints []int
	for i := 0; i < 10000000; i++ {
		n := r1.Intn(10)
		ints = append(ints, n)
	}

	// Count ints occurrence
	count := make(map[int]int)
	for _, n := range ints {
		count[n]++
	}

	// Sort ints
	var intsSorted []int
	for n := range count {
		intsSorted = append(intsSorted, n)
	}

	// Print out ints occurrence
	for n := range intsSorted {
		fmt.Printf("%d\t%d\n", n, count[n])
	}
}

$ cat x.pl
#!/usr/bin/perl
use warnings;
use strict;

# Generate random integers
my @ints;
push @ints, int rand 10 for 1 .. 10_000_000;

# Count ints occurrence
my %count;
$count{$_}++ for @ints;

# Print out ints occurrence
for my $int ( sort keys %count ) {
    printf "%d\t%d\n", $int, $count{$int};
}
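
A note on the "Sort ints" step in x.go above: the loop collects the map keys but never sorts them, and the final `for n := range intsSorted` walks the slice indices rather than its values. The output still comes out right, but only because the keys happen to be exactly 0 through 9. Below is a minimal sketch of what that step presumably intended, using `sort.Ints` from the standard library. The helper name `sortedKeys` and the sample data are made up for illustration and are not part of the original program.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper (not in the original x.go): it
// returns the keys of count in ascending order.
func sortedKeys(count map[int]int) []int {
	keys := make([]int, 0, len(count))
	for k := range count {
		keys = append(keys, k)
	}
	sort.Ints(keys)
	return keys
}

func main() {
	// Small fixed sample instead of 10 million random numbers.
	count := map[int]int{7: 2, 3: 5, 42: 1}

	// Range over the values, not the indices, of the sorted slice.
	for _, n := range sortedKeys(count) {
		fmt.Printf("%d\t%d\n", n, count[n])
	}
}
```

With keys such as 3, 7 and 42, ranging over indices would look up 0, 1 and 2 and print zero counts, which is why the `_, n` form matters once the keys are not a dense 0..N range.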

In conclusion I must say that I like both languages. I like beer too :-).

Always rewrite your code from scratch, preferably twice. -- Tom Christiansen

Replies are listed 'Best First'.
Re: Camel vs. Gopher
by kschwab (Vicar) on Dec 08, 2018 at 21:20 UTC

    Well, if you need it fast, this seems faster than golang:

    $ # note, run it once before timing it...inline::c will cache the created code
    $ time perl x.pl
    0       999837
    1       999643
    2       998992
    3       999381
    4       1000629
    5       999501
    6       999830
    7       1001287
    8       1001751
    9       999149
    
    real    0m0.129s
    user    0m0.124s
    sys     0m0.004s
    

    I'm "cheating" though:

    #!/usr/bin/perl
    use Inline C;
    use strict;
    use warnings;

    my $count=doit(10_000_000);
    for my $int ( sort keys %{$count} ) {
        printf "%s\t%s\n", $int,$$count{$int};
    }
    __END__
    __C__
    #include <string.h>

    SV* doit(int howmany) {
        HV* hv=newHV();
        unsigned int count[10]={0};
        time_t t;
        char key[2];

        srand((unsigned) time(&t));
        for (int i=0;i<howmany;i++) {
            count[rand()%10]++;
        }
        for (int i=0;i < 10;i++) {
            sprintf(key,"%d",i);
            hv_store(hv,key,1,newSVuv(count[i]),0);
        }
        return newRV_noinc((SV *)hv);
    }

      I find it mildly amusing that this is the second time in less than a week where a Perlmonks question was answered using the C rand() function :)

Re: Camel vs. Gopher
by Your Mother (Archbishop) on Dec 08, 2018 at 19:57 UTC

    Another Perl take, pretty idiomatic; seems to be a fair bit faster.

    @ten = ( 0 .. 9 );
    $count{$ten[rand@ten]}++ for 1..10_000_000;
    print "$_ -> $count{$_}\n" for @ten;

      Go is cool. So is Perl. :) Below, demonstrations based on Your_Mother's example.

      Serial Code

      use strict;
      use warnings;
      use Time::HiRes 'time';

      my $start = time;
      my @ten   = ( 0 .. 9 );
      my %count;

      $count{ $ten[ rand @ten ] }++ for 1..10_000_000;

      print "$_ -> $count{$_}\n" for @ten;
      printf "duration: %0.3f seconds\n", time - $start;

      Parallel Implementation

      Disclaimer: I typically use the OO interface for shared variables. The reason is that exclusive locking is handled automatically.

      use strict;
      use warnings;

      use MCE;
      use MCE::Shared;
      use Time::HiRes 'time';

      my $start = time;
      my @ten   = ( 0 .. 9 );
      my $count = MCE::Shared->hash();

      MCE->new(
          max_workers => 4,
          sequence    => [ 1, 10_000_000 ],
          bounds_only => 1,
          chunk_size  => 50_000,
          user_func   => sub {
              my ( $mce, $seq, $chunk_id ) = @_;
              my %lcount;

              # compute using a local hash - involves zero IPC
              $lcount{ $ten[ rand @ten ] }++ for $seq->[0] .. $seq->[1];

              # increment shared hash - one IPC per key
              while ( my ( $key, $val ) = each %lcount ) {
                  $count->incrby( $key, $val );
              }
          }
      )->run;

      printf "$_ -> %ld\n", $count->get($_) for @ten;
      printf "duration: %0.3f seconds\n", time - $start;

      Results running Perl v5.28, parallel time includes workers spawning-shutdown

      Serial    1.534 seconds
      Parallel  0.410 seconds

      0 -> 999455
      1 -> 1000312
      2 -> 999828
      3 -> 1001949
      4 -> 999227
      5 -> 997375
      6 -> 1001048
      7 -> 999806
      8 -> 1000212
      9 -> 1000788

      Regards, Mario
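
For comparison with the OP's Go program, the same "count locally, merge once" idea from the parallel demonstration above can be sketched with goroutines. This is only an illustration under stated assumptions, not a translation of MCE: the helper name `countParallel`, the fixed per-worker seeds, and the choice of a mutex for the merge are all mine.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// countParallel is a hypothetical helper: it splits total draws across
// the given number of goroutines, lets each one count into a private
// array, and merges each private array into the shared result once.
func countParallel(total, workers int) [10]int {
	var (
		shared [10]int
		mu     sync.Mutex
		wg     sync.WaitGroup
	)
	for w := 0; w < workers; w++ {
		// Spread the remainder over the first few workers.
		n := total / workers
		if w < total%workers {
			n++
		}
		wg.Add(1)
		// Seeds are fixed here (an assumption) so runs are repeatable.
		go func(seed int64, n int) {
			defer wg.Done()
			// Each goroutine owns its generator, so counting needs no lock.
			r := rand.New(rand.NewSource(seed))
			var local [10]int
			for i := 0; i < n; i++ {
				local[r.Intn(10)]++
			}
			// One short critical section per worker.
			mu.Lock()
			for k, v := range local {
				shared[k] += v
			}
			mu.Unlock()
		}(int64(w)+1, n)
	}
	wg.Wait()
	return shared
}

func main() {
	for k, v := range countParallel(10000000, 4) {
		fmt.Printf("%d -> %d\n", k, v)
	}
}
```

As in the MCE version, the shared structure is touched only at the end of each worker's run, so the synchronization cost stays small next to the counting itself.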

        The pipeline method batches multiple operations into a single IPC call. A smaller chunk_size value is needed to notice the difference.

        # increment shared hash - single IPC per chunk
        $count->pipeline(
            map { [ 'incrby', $_ => $lcount{$_} ] } keys %lcount
        );

        Comparison chunk_size => 1_000

        Before  0.814 seconds  $count->incrby(...)
        After   0.522 seconds  $count->pipeline(...)

        Regards, Mario

      IMO, the OP's comparison itself doesn't say much about which language is faster. Yours is slightly faster since 1..N does not create a huge-ass array in memory (since some version of Perl), whereas the OP's version needs to allocate 10_000_000 ints (since they're stored in an array).

        400% is not a slight speedup :|

Re: Camel vs. Gopher
by vr (Curate) on Dec 09, 2018 at 12:45 UTC

    Wait, did you say "short and fast and Perl"? One-liner, then?

    $ time perl -MPDL -e 'print transpose pdl long hist +(10*random 1e7),0,10,1'

    [
     [ 0 998557]
     [ 1 998651]
     [ 2 1000878]
     [ 3 1001181]
     [ 4 1000788]
     [ 5 1000577]
     [ 6 1000108]
     [ 7 999979]
     [ 8 997430]
     [ 9 1001851]
    ]

    real    0m0.290s
    user    0m0.215s
    sys     0m0.074s

    Well, twice as slow as kschwab's code, but isn't he a cheater: you asked to "generate 10 million random integers", and only then "count the occurrence of each". If you ask where the integers are above: the bins effectively "floor" the random numbers in each bin to the bin's lower limit.

    P.S. Generating explicit integers as

    $ time perl -MPDL -e 'print transpose pdl long hist +(byte 10*random 1e7),0,10,1'

    is only very slightly longer (in both length and time).
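
The "binning is flooring" remark above can be illustrated with a small fixed-width histogram. This is a sketch of the general idea, not PDL's implementation: the function name `hist`, its argument order, and the decision to drop out-of-range values are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/rand"
)

// hist is a hypothetical fixed-width histogram, not PDL's implementation:
// a value x in [min, min+step*bins) lands in bin int((x-min)/step);
// values outside that range are dropped.
func hist(xs []float64, min, step float64, bins int) []int {
	counts := make([]int, bins)
	for _, x := range xs {
		if x < min {
			continue
		}
		b := int((x - min) / step)
		if b < bins {
			counts[b]++
		}
	}
	return counts
}

func main() {
	// Fixed seed and a smaller sample than 1e7, to keep the demo quick.
	r := rand.New(rand.NewSource(1))
	xs := make([]float64, 1000000)
	for i := range xs {
		xs[i] = 10 * r.Float64() // uniform in [0, 10)
	}
	for b, c := range hist(xs, 0, 1, 10) {
		fmt.Printf("%d\t%d\n", b, c)
	}
}
```

With a minimum of 0 and a step of 1, the bin index is just the integer part of each value, which is why histogramming the floats gives the same counts as truncating them to integers first.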

Re: Camel vs. Gopher
by morgon (Priest) on Dec 12, 2018 at 09:55 UTC
    I like both languages
    Me too.

    Concurrency is finally easy and fun.

    Being faster than Perl is only a bonus.

      Being faster than Perl is only a bonus.

      Hey now. My direct, pure Perl answer was four times faster than the original Perl presented and not even twice as slow as the go; 70% slower on my box.

      vr's answer, with a well-known Perl module, is two times faster than the go and kschwab's answer is ten times faster than the go and still shorter by several lines and a few dozen chars. Being faster is a matter of tuning and knowing the problem space. With the evidence presented, Perl is the clear speed, space, and cognitive load winner if you know how to match it with the problem.

        Sure :-)

        In this particular case, where you don't even think about doing part of the work concurrently, a Perl solution can maybe be optimized to even beat Go.

        But I can see that you're not being too serious...

        For me Go is the new Perl - a language with a rich ecosystem, that is fun to use and that makes things easy that were very difficult before...

      Concurrency and shared data have been fun and easy and fast in Perl for quite some time, for example with MCE and MCE::Shared.


      The way forward always starts with a minimal test.

        See the MCE and MCE::Shared demonstration here.

        That may be so.

        I see that you need more convincing, so here is another killer feature of Go: cross-compiling statically linked binaries.

        On my Linux system I can build binaries for Mac (or Windows) that do not have any dependencies and will simply work when I copy ONE file to another machine running a different OS.

        Do that with Perl.

Node Type: perlmeditation [id://1226977]
Approved by marto
Front-paged by Discipulus