in reply to shell vs. filehandler

Common sense tells us that reading the file with Perl would be much faster than launching an entirely new process, reading the file, dumping it to STDOUT and then reading it from there. But just to be sure, whenever you want to find out which thing is faster, fire up Benchmark.

perl -MBenchmark -e 'timethese( 1000, {
    cat  => sub { my $str = qx|cat /usr/share/dict/words|; },
    read => sub { local $/; open my $f, "/usr/share/dict/words" or die $!; my $str = <$f>; }
} )'

Benchmark: timing 1000 iterations of cat, read...
 cat: 6 wallclock secs ( 1.69 usr 1.34 sys + 0.28 cusr 2.34 csys = 5.65 CPU) @ 330.03/s (n=1000)
read: 1 wallclock secs ( 0.58 usr + 0.58 sys = 1.16 CPU) @ 862.07/s (n=1000)

Re^2: shell vs. filehandler
by 10basetom (Novice) on Feb 11, 2005 at 05:38 UTC
    thank you for the feedback, guys... i will go with the second method. before reading about benchmark, i timed the two methods using unix's time command. here are the results:
    cat:  real 0.3  user 0.1  sys 0.2
    read: real 0.2  user 0.0  sys 0.1
      Hi. With so few instructions executed, both scripts run too briefly for a trustworthy benchmark.

      real measures how long the program took to execute (wall-clock time), user how much CPU time it spent in user mode, and sys how much it spent in kernel mode.

      Note that your system may be busy, so many processes may be competing for the CPU and switching context.

      This background load (part of the so-called system entropy) can change over a very short time, and since your scripts also run for only a very short time, it may affect each run differently.

      This may cause the fastest script to appear as the slowest.

      Try rewriting them to open a reasonable number of files, so you can compare execution times more accurately.
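One way to follow that advice is to repeat each method many times inside a single script and time the whole batch, so the total is large compared with scheduler noise. This is only a rough sketch: the temporary file, the helper name time_it and the count of 200 are my own assumptions, not something from the thread, and it needs a Unix-like system with cat on the PATH.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical test data: a temporary file of 10,000 short lines.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "line $_\n" for 1 .. 10_000;
close $fh or die "close: $!";

# Run a code ref $n times and return the elapsed wall-clock seconds.
sub time_it {
    my ($n, $code) = @_;
    my $t0 = [gettimeofday];
    $code->() for 1 .. $n;
    return tv_interval($t0);
}

# The two approaches being compared: shelling out to cat vs. slurping in Perl.
my %method = (
    cat  => sub { my $str = qx{cat "$path"}; return $str },
    read => sub {
        local $/;
        open my $f, '<', $path or die "open: $!";
        my $str = <$f>;
        return $str;
    },
);

my $n = 200;    # arbitrary; raise it until the totals are well above timer noise
for my $name (sort keys %method) {
    my $secs = time_it($n, $method{$name});
    printf "%-4s %d runs in %.3f s (%.5f s per run)\n",
        $name, $n, $secs, $secs / $n;
}
```

Timing the batch rather than a single run is roughly what Benchmark's timethese does for you; the hand-rolled version just makes the idea visible.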
Re^2: shell vs. filehandler
by CountZero (Bishop) on Feb 11, 2005 at 07:29 UTC
    Indeed, running a benchmark is the only way to be sure, but looks can be deceiving: repeatedly running a script that reads a file may "suffer" from caching effects in the OS; i.e. the first read of the file may take a relatively long time, while all reads thereafter may come from the cache and hence be much faster.
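To look for this effect, one could time each read separately instead of averaging over all of them. The sketch below is only an illustration under my own assumptions: read_timings is a made-up helper, and the fallback temporary file has just been written, so it is probably already cached and will show little difference. Pass the path of a large file that has not been read recently to have a chance of seeing a slow first read.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Time::HiRes qw(gettimeofday tv_interval);

# Slurp $path $count times and return the elapsed seconds of each read.
sub read_timings {
    my ($path, $count) = @_;
    my @secs;
    for (1 .. $count) {
        my $t0 = [gettimeofday];
        open my $f, '<', $path or die "open $path: $!";
        my $str = do { local $/; <$f> };
        close $f;
        push @secs, tv_interval($t0);
    }
    return @secs;
}

# Use the file named on the command line, or fall back to a throwaway file.
my $path = shift @ARGV;
if (!defined $path) {
    (my $fh, $path) = tempfile(UNLINK => 1);
    print {$fh} "line $_\n" for 1 .. 10_000;
    close $fh or die "close: $!";
}

my @secs = read_timings($path, 5);
printf "read %d: %.6f s\n", $_ + 1, $secs[$_] for 0 .. $#secs;
```

If the first line of output is noticeably slower than the rest, a benchmark that averages over all iterations is mostly measuring cached reads.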

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law