meonkeys has asked for the wisdom of the Perl Monks concerning the following question:

A coworker claimed that "Java could beat Perl on any platform in this simple test: read a file in to a variable, write the contents of the variable to another file", so I decided to take him up on the challenge. I use Red Hat Linux 7.0 on a 500 MHz Pentium III, with Perl 5.005_03 and Sun's Java 1.3.1.

Here's the Perl I used:
#!/usr/bin/perl -w
use Time::HiRes qw( time );

undef $/;
$| = 0;
my $file;
my $start = time();
open(FILE, $ARGV[0]);
$file = <FILE>;
close FILE;
open(FILE, ">$ARGV[1]");
print FILE $file;
close FILE;
my $total = time() - $start;
print "$total Seconds Elapsed.\n";
Some sample runs:
[adamm@pepsi java_vs_perl]$ dd if=/dev/zero of=ten_meg_file.txt seek=10 count=1 bs=1M
1+0 records in
1+0 records out
[adamm@pepsi java_vs_perl]$ perl -w cp.pl ten_meg_file.txt out
0.405324935913086 Seconds Elapsed.
[adamm@pepsi java_vs_perl]$ perl -w cp.pl ten_meg_file.txt out
0.322497963905334 Seconds Elapsed.
[adamm@pepsi java_vs_perl]$ perl -w cp.pl ten_meg_file.txt out
0.321141004562378 Seconds Elapsed.
[adamm@pepsi java_vs_perl]$ perl -w cp.pl ten_meg_file.txt out
0.318119049072266 Seconds Elapsed.
I realize that subsequent reads may be cached, resulting in faster times.

And the Java (sorry, I know this isn't Javamonks):
import java.io.*;

public class X {
    public static void copy(String from, String to) throws IOException {
        InputStream inFile = null;
        OutputStream outFile = null;
        try {
            inFile = new FileInputStream(from);
            outFile = new FileOutputStream(to);
            int length = inFile.available();
            byte[] bytes = new byte[length];
            inFile.read(bytes);
            outFile.write(bytes);
        } finally {
            inFile.close();
            outFile.close();
        }
    }

    public static void copyBuf(String from, String to) throws IOException {
        InputStream in = null;
        OutputStream out = null;
        InputStream inFile = null;
        OutputStream outFile = null;
        try {
            inFile = new FileInputStream(from);
            in = new BufferedInputStream(inFile);
            outFile = new FileOutputStream(to);
            out = new BufferedOutputStream(outFile);
            int length = in.available();
            byte[] bytes = new byte[length];
            in.read(bytes);
            out.write(bytes);
        } finally {
            in.close();
            out.close();
        }
    }

    public static void main(String[] args) {
        try {
            long l = System.currentTimeMillis();
            copy(args[0], args[1]);
            System.out.println("Time = " + (System.currentTimeMillis() - l));
            l = System.currentTimeMillis();
            copyBuf(args[0], args[1]);
            System.out.println("Buf Time = " + (System.currentTimeMillis() - l));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
And some sample runs:
[adamm@pepsi java_vs_perl]$ javac X.java
[adamm@pepsi java_vs_perl]$ dd if=/dev/zero of=ten_meg_file__2.txt seek=10 count=1 bs=1M
1+0 records in
1+0 records out
[adamm@pepsi java_vs_perl]$ java X ten_meg_file__2.txt out__2
Time = 1205
Buf Time = 815
[adamm@pepsi java_vs_perl]$ java X ten_meg_file__2.txt out__2
Time = 722
Buf Time = 838
[adamm@pepsi java_vs_perl]$ java X ten_meg_file__2.txt out__2
Time = 814
Buf Time = 779
These times are in milliseconds, and indicate that Perl was about twice as fast as Java at this simple operation. So now, my questions.

Thanks!

---
"A Jedi uses the Force for knowledge and defense, never for attack."

Replies are listed 'Best First'.
Re: Java vs. Perl file I/O benchmarks; valid test?
by grinder (Bishop) on Feb 28, 2002 at 20:07 UTC
    A reasonable explanation of why Perl is faster is that you are using the expression
    $file = <FILE>;

    This is going to be handled by an opcode or two, that will run at C speed. On the other hand, the Java implementation is forced to use an explicit loop to fetch all the contents of the file. Other things being equal, this will kill Java.

    What really strikes me, though, is the fact that the Java version is about three times as long. In terms of programmer efficiency, that's an important factor to take into consideration. Much more so than raw I/O throughput.

    I'm no expert at understanding Perl's op codes, but if you do a perl -MO=Terse copyfile you can get a feeling for what is going on under the hood.


    Dwelling on this overnight, it occurred to me that the java implementation is not following the spec "read a file in to a variable". The Perl implementation is slurping the entire file into a variable, which puts considerable effort on the VM manager. The Java implementation, on the other hand, is reading a bit from the input file, and then writing a bit to the output file, until the input file is exhausted.

    At no time does the Java version hold an entire copy of the file in core, whereas Perl does, paying the added cost of system overhead while the kernel frantically discards 10 megabytes of pages used for caching and buffering, in order to give them to the Perl process.

    A fairer comparison would be to use this code:

    open(IN, $ARGV[0]);
    open(OUT, ">$ARGV[1]");
    print OUT <IN>;
    close IN;
    close OUT;

    Looking at it that way, it's clear the Perl programmer would have already typed in the program, corrected spurious syntax errors, run the program, received the results, before the Java programmer had finished typing in the Java program.


    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'
      A couple things. I don't see the looping that you're talking about in the java versions. The java version from the original poster provides 2 methods for copying, and both read the whole contents of the file into an array and then write that array out to a new file. So the overall idea *is* the same: read the entire first file into memory, write that memory to a second file.
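
      One caveat on the array-based approach: InputStream.read(byte[]) is only specified to read up to the array's length, and available() is documented as an estimate of what can be read without blocking, not as the file size. Below is a minimal sketch of a copy that still holds the whole file in memory but loops until end of stream. The names SlurpCopy and slurp and the 64 KB chunk size are made up for illustration; this is not the code that was timed in the original post.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class; not part of the original X.java.
class SlurpCopy {

    // Read the whole stream into memory, looping because a single
    // read() call may return fewer bytes than were asked for.
    static byte[] slurp(InputStream in) throws IOException {
        ByteArrayOutputStream held = new ByteArrayOutputStream();
        byte[] chunk = new byte[64 * 1024]; // arbitrary chunk size
        int n;
        while ((n = in.read(chunk)) != -1) {
            held.write(chunk, 0, n);
        }
        return held.toByteArray();
    }

    // Slurp the source file into a variable, then write it out in one call.
    static void copy(String from, String to) throws IOException {
        byte[] bytes;
        InputStream in = new FileInputStream(from);
        try {
            bytes = slurp(in);
        } finally {
            in.close();
        }
        OutputStream out = new FileOutputStream(to);
        try {
            out.write(bytes);
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SlurpCopy <from> <to>");
            return;
        }
        copy(args[0], args[1]);
    }
}
```

      The ByteArrayOutputStream keeps the sketch short at the cost of an extra in-memory copy when toByteArray() is called, so it is not tuned for benchmarking.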

      Also, you do realize that the Java "version" is actually 2 different methods? So it's not really as long as it appears. Additionally, condensing the JDK 1.4 version I gave below down to some hideous anonymous objects, it gets pretty short. Not on par with the Perl version, but hey, that's one of Perl's features.

      import java.io.*;
      import java.nio.*;
      import java.nio.channels.*;

      public class X2 {
          public static void main(String[] args) throws Exception {
              FileChannel in = new FileInputStream(args[0]).getChannel();
              new FileOutputStream(args[1]).getChannel().write(
                  in.map(FileChannel.MapMode.READ_ONLY, 0, in.size()));
          }
      }
      Cheers,

      - danboo

Re: Java vs. Perl file I/O benchmarks; valid test?
by simon.proctor (Vicar) on Feb 28, 2002 at 20:12 UTC
    Assuming the files aren't too big you could have also done this:
    UNTESTED
    use strict;
    use warnings;

    my $data;
    open(FILE, "<yourfile") || die "oops $!";
    {
        local $/ = undef;
        $data = <FILE>;
    }
    close(FILE);
    open(FILE2, ">file2");
    print FILE2 $data;
    close(FILE2);
    I program in Java as well and I can tell you it will nearly always be slower for most things. Note the nearly and most. I'm sure a threaded program in Java would give Perl a run for its money.

    However, that's life, that's programming, live with it. Personally I hate these kinds of comparisons. People seem to forget there's fast... and there's fast enough. If you can code a program quickly (3 days, say) and it's only 10% slower than a program that takes 3 weeks to build, then frankly you could build it, run it and go home before the other one is done.

    Of course maintainability, scalability and coding standards are all factors and they should be factored into the build times too. Something that in my experience people forget.

    Just my little rant :P.
Re: Java vs. Perl file I/O benchmarks; valid test?
by steves (Curate) on Mar 01, 2002 at 10:09 UTC

    You might also be interested in this recent article discussing natively compiled Java.

Re: Java vs. Perl file I/O benchmarks; valid test?
by danboo (Beadle) on Feb 28, 2002 at 21:07 UTC
    I wonder if the new capabilities of the newly released JDK 1.4 would affect the outcome. It now supports memory mapped I/O. See the 'java.nio.*' package.

    Cheers,

    - danboo

      OK, I whipped up a version using JDK 1.4 and the java.nio package. It's faster than either of the other versions on my system.

      [danboo@danboo 04:37pm ~/java]$ java X ten_meg_file__2.txt out__2
      Time = 492
      Buf Time = 432
      [danboo@danboo 04:37pm ~/java]$ perl X.pl ten_meg_file__2.txt out__2
      0.192673087120056 Seconds Elapsed.
      [danboo@danboo 04:37pm ~/java]$ java X2 ten_meg_file__2.txt out__2
      Time = 128
      Source for X2.java:
      import java.io.*;
      import java.nio.*;
      import java.nio.channels.*;

      public class X2 {
          public static void main(String[] args) {
              try {
                  long l = System.currentTimeMillis();
                  FileInputStream input = new FileInputStream(args[0]);
                  FileOutputStream output = new FileOutputStream(args[1]);
                  FileChannel inputChannel = input.getChannel();
                  FileChannel outputChannel = output.getChannel();
                  int inputLength = (int) inputChannel.size();
                  MappedByteBuffer buffer = inputChannel.map(FileChannel.MapMode.READ_ONLY, 0, inputLength);
                  outputChannel.write(buffer);
                  System.out.println("Time = " + (System.currentTimeMillis() - l));
              } catch (Exception e) {
                  e.printStackTrace();
              }
          }
      }
      Note: This is my first time using the new 'java.nio' package for memory mapped file operations. There may be better ways to do it. I don't necessarily believe these tests are accurate.
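
      As one possible alternative within the same package, FileChannel also offers transferTo, which hands the copy to the channel implementation instead of mapping the file. The sketch below is untimed and makes no claim about speed; the class name X3 is hypothetical. It also never holds the file contents in a Java variable, so strictly speaking it steps outside the original challenge.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

// Hypothetical class name; a variant of X2, not the code timed above.
class X3 {

    // Copy with FileChannel.transferTo. A single call may move fewer
    // bytes than requested, so keep going until the whole file is sent.
    static long copy(String from, String to) throws IOException {
        FileInputStream input = new FileInputStream(from);
        try {
            FileOutputStream output = new FileOutputStream(to);
            try {
                FileChannel in = input.getChannel();
                FileChannel out = output.getChannel();
                long size = in.size();
                long done = 0;
                while (done < size) {
                    long moved = in.transferTo(done, size - done, out);
                    if (moved <= 0) {
                        break; // nothing more to transfer (e.g. file shrank)
                    }
                    done += moved;
                }
                return done;
            } finally {
                output.close();
            }
        } finally {
            input.close();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java X3 <from> <to>");
            return;
        }
        long l = System.currentTimeMillis();
        copy(args[0], args[1]);
        System.out.println("Time = " + (System.currentTimeMillis() - l));
    }
}
```

      Using long offsets here also avoids the (int) cast on the file size, which matters only for files larger than 2 GB.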

      Cheers

      - danboo

      For those who are interested, look at the java.nio.channels package for a comparison. The documentation for the FileChannel class is interesting as it covers this specific topic.

      It does some quite nice things with optimising access to large files. However, as I have only just downloaded it, I couldn't say if it's great or even quicker than Perl. Not that I really mind either way :)
(tye)Re: Java vs. Perl file I/O benchmarks; valid test?
by tye (Sage) on Mar 01, 2002 at 17:20 UTC

    Note that on some platforms Perl's I/O is faster than C's buffered I/O (about twice as fast) which probably means it will be faster than Java's I/O. On other platforms, Perl's buffered I/O is about 1/2 as fast as C's buffered I/O. Linux is one of the platforms where Perl's I/O is slow.

    If "perl -V" reports "d_stdstdio=define", then Perl's I/O is probably fast on your platform.

    So your friend picked one of the things that Perl can be very fast at because Perl does really scary hacks to make I/O fast but only on platforms where the Configure tests show that those scary hacks actually work.

            - tye (but my friends call me "Tye")
Re: Java vs. Perl file I/O benchmarks; valid test?
by perrin (Chancellor) on Feb 28, 2002 at 19:54 UTC
    Frankly, your co-worker is an idiot. Obviously Perl will be faster at this because it doesn't have all that OO baggage as a requirement for a simple program like this. Simpler is usually equal to faster, and this is a case where the perl code to do the job is MUCH simpler.