I just read an interesting article on Newsfactor.com. A student researcher made memory and time measurements for two different problems (text search and sieve of Eratosthenes) in four programming languages: C, Perl, Java, and Basic. Not surprisingly, the programs in C ran the fastest and used the least memory overall.

But I was surprised that in the relative rankings of the other languages, Perl was the second to most memory-efficient (in her terminology) for the text search problem, and most memory-efficient in the numeric problem, even beating out C. I had always believed the "common knowledge" that Perl used lots of memory, favoring memory use wherever it could gain in execution speed. The time efficiency numbers are not too shabby for Perl either, but the memory efficiency was a real surprise to me.

The article doesn't provide any code, and even the researcher admits she was not knowledgeable of programming when she started, so take the results with a grain of salt. Can anyone find the paper or code online? I couldn't (not lazily, anyway ;-) It was presented at this year's AAAS conference.


Given the recent inter-language flamage that's ocurred in the monastery, I want to add this clarification. My interest in the paper is not "My language wins, your language loses", but that a belief I'd held about Perl appears to be not borne out experimentally.

Replies are listed 'Best First'.
Re: Efficiency comparisons
by Abigail-II (Bishop) on Feb 25, 2003 at 00:02 UTC
    I'm baffled that Perl beats C in anything. How could it, given that perl is written in C? Anyway, Perl isn't at all geared at numeric problems. However, the "sieve" problem isn't what I would call a numerical computation - no floats are involved, all you need to do is clever addition. As for memory, for this particular problem, Perl and C should use the same amount of memory. If C loses, the C program wasn't written well.

    Think about it, for a "sieve", your datastructure is nothing more than a large bit string. For which Perl gives you vec to do handy manipulations with. But in reality, it's just string (Perl), which turns out to be a character array (C). Hence, no reason the C program should use more memory than the Perl one - unless it wasn't written well.

    Below is a sieve program I once wrote in Perl. To find all primes less than 100 million (as in the article), it uses less than 6 Mb. And that includes the interpreter. There's no reason a C program should use more memory.

    #!/opt/perl/bin/perl -l # A sieve based on base 6. Only possible primes are those numbers p # for which p mod 6 == 1, or p mod 6 == 5. # # Let n = 6 * k + l (0 <= l < 6). Then the bit b associated with n is # b = 2 * k + [undef, 0, undef, undef, undef, 1] -> [l] # In reverse, given a bit b, the corresponding number n is # n = 6 * int (b / 2) + [1, 5] -> [b % 2]. use strict; use warnings 'all'; sub scrub ($); sub findnext ($); my $max = $ARGV [0] || 1000; my @PRIMES = (2, 3); my $BASE = 6; # LCM of @PRIMES. my $BITS_BASE = 2; my $BITS_BYTE = 8; my $bits = $BITS_BASE * int ($max / $BASE) + ($max % $BASE ? $BITS_BA +SE : 0); my $bytes = int ($bits / $BITS_BYTE) + ($bits % $BITS_BYTE ? 1 : 0); my $sieve = eval qq !"\xFF" x $bytes!; # Half the memory usuage. # Numbers less than the base are special. foreach my $prime (@PRIMES) { print $prime } my $i = 5; for (; $i && $i <= sqrt ($max); $i = findnext $i) { print $i; scrub $i; } for (; $i && $i < $max ; $i = findnext $i) { print $i; } exit; my ($offset1, $offset2); INIT { $offset1 = [undef, 0, undef, undef, undef, 1]; $offset2 = [1, 5]; } # Scrub out all multiples of the given argument. It's easy to see that # all multiples less than the square already have been crossed out, an +d # we only have to scrub out the odd multiples. sub scrub ($) { use integer; my $n = shift; my $m = $n; if ($m % 3 == 1) { my $c = $n * $m; vec ($sieve, $BITS_BASE * ($c / $BASE) + $offset1 -> [$c % $BA +SE], 1) = 0; $m += 4; } my $c = $n * $m; # $b is the bit index in the sieve, $b1 and $b2 are the increments # in the loop. my $b = $BITS_BASE * ($c / $BASE) + $offset1 -> [$c % $BASE]; my $b1 = (2 * $n / ($BASE / $BITS_BASE)); my $b2 = (4 * $n / ($BASE / $BITS_BASE)); # Exactly one of $b1, $b2 needs to be incremented by 1, # depending on the parity of $c. ($c % 6 == 1 ? $b2 : $b1) += 1; # Make sure we don't go out of range. my $l = $bits - $b1 - $b2; # Core loop, we want to minimize work here. while ($b <= $l) { vec ($sieve, $b, 1) = 0; $b += $b1; vec ($sieve, $b, 1) = 0; $b += $b2; } # Basically a copy of the previous loop, but with extra checks. # Checks don't hurt now as this loop is performed only once. { last if $b > $bits; vec ($sieve, $b, 1) = 0; $b += $b1; last if $b > $bits; vec ($sieve, $b, 1) = 0; $b += $b2; } } # Find the next prime number after the given number. sub findnext ($) { use integer; my $n = shift; my $i = $BITS_BASE * ($n / $BASE) + $offset1 -> [$n % $BASE] + 1; while ($i <= $bits) { if (vec ($sieve, $i, 1)) { return $BASE * ($i / $BITS_BASE) + $offset2 -> [$i % $BITS +_BASE] } $i ++ } undef; } __END__

    Takes 6 minutes on my laptop to find all primes less than 100 million.

    Abigail

      I'm baffled that Perl beats C in anything. How could it, given that perl is written in C?

      A perl program can (in some cases) beat a C program for the same task given the same amount of programmer effort, because there are optimisations built into the C code for perl that encapsulate more effort than you'd normally ever consider worthwhile for a single application.

      Given unbounded effort, the C program will always at least match the perl program. Few people or organisations have the resources for that, though.

      Hugo
Re: Efficiency comparisons
by jasonk (Parson) on Feb 24, 2003 at 21:04 UTC

    I'd say that efficiency is going to be influenced in no small part by the programmer more than the language. Especially since she seems to have done the same experiment two years ago, and at that time concluded that Perl was the least efficient for her numerical test, losing even to BASIC.

      I agree with you, although I'd put the emphasis on the algorithm rather than the programmer. In this case, since the same programmer wrote all the programs, presumably using well-known algorithms, I thought it would come out in the wash. The fact that the two-year-old results directly contradict these can mean either that her overall results are not valid, or that she's learned Perl a lot better in the meantime. ;-)

Re: Efficiency comparisons
by Limbic~Region (Chancellor) on Feb 24, 2003 at 21:03 UTC
    Sadly, I couldn't find the experiment using the oracle.
    I however did get her email.
    I found the address, along with some of her artwork at this url.

    Let us know if you find the code - happy hunting.
    Cheers - L~R

    Update: At another monk's suggestion, I have removed the obfu'd email address in the node, though it can still be found at the link I provided.

Re: Efficiency comparisons
by gmpassos (Priest) on Feb 27, 2003 at 20:26 UTC
    Well, the only problem for memory in Perl, is the variables, that are always string based, using more memory than typed languages. But if you always use my, and separate your code in subs, to not write again the same part (yes exist peoples that do this), Perl will manage the mamory and clean what is not used.

    O good thing of Perl is that it has an memory area for the variables. And it cleans this are to can be used for new variables. C doesn't do this very well automatically, but is possible (since Perl is writed in C).

    But a good thing exist in Perl to force the use of my: use strict ;-P

    About the other laguage, Java, that is suposed to be a good language. Well, Java in theory is good, but in pratic not. First, is very slow (even the Java version from SUN). Second, the classes are not very well structured, and is a big problem, since every thing comes from a class. Second, the bytecode of .class, that is suposed to make the load faster, isn't! The JVM remake the bytecode for the JVM (OS) in use, and than run. The class bytecode is only a ugly way to try to hide the source (but every one can get a Java-DesAssembler). And I don't know why it's os slow!!! It's a typed language, should to be faster than Perl (but Perl is very well coded, and exist since 87)! The documentation of the classes is a mess too. I think that Java has a lot of things to learn with Perl, they don't copy (more) use because they don't want.

    For me, Java is a film with a good script, but with a bad shot. The objective of a language of 6th level (Java is 6, Perl is 15. see: http://www.theadvisors.com/langcomparison.htm) is to develope code faster and have a better product, since we have more time to make better things, and will be better to fix it. In other words, you can't forgot the final objective!

    Graciliano M. P.
    "The creativity is the expression of the liberty".