in reply to Efficiency comparisons

I'm baffled that Perl beats C in anything. How could it, given that perl is written in C? Anyway, Perl isn't at all geared at numeric problems. However, the "sieve" problem isn't what I would call a numerical computation - no floats are involved, all you need to do is clever addition. As for memory, for this particular problem, Perl and C should use the same amount of memory. If C loses, the C program wasn't written well.

Think about it, for a "sieve", your datastructure is nothing more than a large bit string. For which Perl gives you vec to do handy manipulations with. But in reality, it's just string (Perl), which turns out to be a character array (C). Hence, no reason the C program should use more memory than the Perl one - unless it wasn't written well.

Below is a sieve program I once wrote in Perl. To find all primes less than 100 million (as in the article), it uses less than 6 Mb. And that includes the interpreter. There's no reason a C program should use more memory.

#!/opt/perl/bin/perl -l # A sieve based on base 6. Only possible primes are those numbers p # for which p mod 6 == 1, or p mod 6 == 5. # # Let n = 6 * k + l (0 <= l < 6). Then the bit b associated with n is # b = 2 * k + [undef, 0, undef, undef, undef, 1] -> [l] # In reverse, given a bit b, the corresponding number n is # n = 6 * int (b / 2) + [1, 5] -> [b % 2]. use strict; use warnings 'all'; sub scrub ($); sub findnext ($); my $max = $ARGV [0] || 1000; my @PRIMES = (2, 3); my $BASE = 6; # LCM of @PRIMES. my $BITS_BASE = 2; my $BITS_BYTE = 8; my $bits = $BITS_BASE * int ($max / $BASE) + ($max % $BASE ? $BITS_BA +SE : 0); my $bytes = int ($bits / $BITS_BYTE) + ($bits % $BITS_BYTE ? 1 : 0); my $sieve = eval qq !"\xFF" x $bytes!; # Half the memory usuage. # Numbers less than the base are special. foreach my $prime (@PRIMES) { print $prime } my $i = 5; for (; $i && $i <= sqrt ($max); $i = findnext $i) { print $i; scrub $i; } for (; $i && $i < $max ; $i = findnext $i) { print $i; } exit; my ($offset1, $offset2); INIT { $offset1 = [undef, 0, undef, undef, undef, 1]; $offset2 = [1, 5]; } # Scrub out all multiples of the given argument. It's easy to see that # all multiples less than the square already have been crossed out, an +d # we only have to scrub out the odd multiples. sub scrub ($) { use integer; my $n = shift; my $m = $n; if ($m % 3 == 1) { my $c = $n * $m; vec ($sieve, $BITS_BASE * ($c / $BASE) + $offset1 -> [$c % $BA +SE], 1) = 0; $m += 4; } my $c = $n * $m; # $b is the bit index in the sieve, $b1 and $b2 are the increments # in the loop. my $b = $BITS_BASE * ($c / $BASE) + $offset1 -> [$c % $BASE]; my $b1 = (2 * $n / ($BASE / $BITS_BASE)); my $b2 = (4 * $n / ($BASE / $BITS_BASE)); # Exactly one of $b1, $b2 needs to be incremented by 1, # depending on the parity of $c. ($c % 6 == 1 ? $b2 : $b1) += 1; # Make sure we don't go out of range. my $l = $bits - $b1 - $b2; # Core loop, we want to minimize work here. while ($b <= $l) { vec ($sieve, $b, 1) = 0; $b += $b1; vec ($sieve, $b, 1) = 0; $b += $b2; } # Basically a copy of the previous loop, but with extra checks. # Checks don't hurt now as this loop is performed only once. { last if $b > $bits; vec ($sieve, $b, 1) = 0; $b += $b1; last if $b > $bits; vec ($sieve, $b, 1) = 0; $b += $b2; } } # Find the next prime number after the given number. sub findnext ($) { use integer; my $n = shift; my $i = $BITS_BASE * ($n / $BASE) + $offset1 -> [$n % $BASE] + 1; while ($i <= $bits) { if (vec ($sieve, $i, 1)) { return $BASE * ($i / $BITS_BASE) + $offset2 -> [$i % $BITS +_BASE] } $i ++ } undef; } __END__

Takes 6 minutes on my laptop to find all primes less than 100 million.

Abigail

Replies are listed 'Best First'.
Re: Re: Efficiency comparisons
by hv (Prior) on Feb 25, 2003 at 01:12 UTC
    I'm baffled that Perl beats C in anything. How could it, given that perl is written in C?

    A perl program can (in some cases) beat a C program for the same task given the same amount of programmer effort, because there are optimisations built into the C code for perl that encapsulate more effort than you'd normally ever consider worthwhile for a single application.

    Given unbounded effort, the C program will always at least match the perl program. Few people or organisations have the resources for that, though.

    Hugo