Benchmark cannot "know" what my hypothetical "good" compiler would do.
And? It shows what *any* non hypothetical, can compile code now, real world useful compiler can do. Out in the real world hypothetical compilers producing theoretical code are less that totally useful. The results presented were for CL.EXE /O2 on a Win2K AMD K7 system. FWIW GCC does a significantly worse job of optimising the recursive code.
By all means compile it with anything you like on any platform and present any evidence that supports your hypothesis.
[root@devel3 root]# ./test.pl
Benchmark: timing 1000000 iterations of iterative_c, iterative_p, recu
+rsive_c, recursive_p...
iterative_c: 1 wallclock secs ( 0.42 usr + 0.00 sys = 0.42 CPU) @ 2
+380952.38/s (n=1000000)
iterative_p: 20 wallclock secs (18.34 usr + 0.00 sys = 18.34 CPU) @ 5
+4525.63/s (n=1000000)
recursive_c: 1 wallclock secs ( 0.60 usr + 0.00 sys = 0.60 CPU) @ 1
+666666.67/s (n=1000000)
recursive_p: 38 wallclock secs (37.75 usr + 0.00 sys = 37.75 CPU) @ 2
+6490.07/s (n=1000000)
Rate recursive_p iterative_p recursive_c iterative_c
recursive_p 26490/s -- -51% -98% -99%
iterative_p 54526/s 106% -- -97% -98%
recursive_c 1666667/s 6192% 2957% -- -30%
iterative_c 2380952/s 8888% 4267% 43% --
[root@devel3 root]# gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)
| [reply] [Watch: Dir/Any] [d/l] |
I don't have inline installed on the Perl on the machine I'm typing this on, so I took the liberty of doing a pure C benchmark. I had to admit that I was, at first, surprised
at the result, since I had heard that gcc does tail call elimination, then I found this article that explained why gcc was failing to perform this optimization. Subsequently, I changed guess_rec_c to read:
int guess_rec_c( int ans, int lower, int higher ) {
int guess;
guess = (int)((lower + higher)/2);
if (ans == guess) return guess;
if ( guess > ans ) {
return guess_rec_c( ans, lower, guess -1 );
}
else {
return guess_rec_c( ans, guess +1, higher );
}
}
...and using:
gcc -v
Reading specs from /usr/lib/gcc-lib/i586-suse-linux/3.3.3/specs
Configured with: ../configure --enable-threads=posix
--prefix=/usr --with-local-prefix=/usr/local
--infodir=/usr/share/info --mandir=/usr/share/man
--enable-languages=c,c++,f77,objc,java,ada
--disable-checking --libdir=/usr/lib --enable-libgcj
--with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib
--with-system-zlib --enable-shared --enable-__cxa_atexit
i586-suse-linux
Thread model: posix
gcc version 3.3.3 (SuSE Linux)
with the compiler optimization -O3, the recursive version was slightly faster than the iterative version:
jordan@linux-jordan:~> more guess.c
int guess_it_c( int ans, int lower, int higher ) {
int guess;
for(;;) {
guess = (int)((lower + higher)/2);
if ( ans == guess ) break;
if ( guess > ans )
higher = guess -1;
else
lower = guess +1;
}
return guess;
}
int guess_rec_c( int ans, int lower, int higher ) {
int guess;
guess = (int)((lower + higher)/2);
if (ans == guess) return guess;
if ( guess > ans ) {
return guess_rec_c( ans, lower, guess -1 );
}
else {
return guess_rec_c( ans, guess +1, higher );
}
}
int main() {
int n;
for (n=0; n<1000000; n++)
VERSION(4999,1,10000);
}
jordan@linux-jordan:~> gcc -Wall -DVERSION=guess_it_c -O3 -o guess_it
+guess.c
jordan@linux-jordan:~> gcc -Wall -DVERSION=guess_rec_c -O3 -o guess_re
+c guess.c
jordan@linux-jordan:~> time ./guess_it
real 0m0.291s
user 0m0.285s
sys 0m0.003s
jordan@linux-jordan:~> time ./guess_rec
real 0m0.284s
user 0m0.276s
sys 0m0.004s
I think this demonstrates that the recursive version of a program can be as efficient as an iterative solution. You might point to the fact that I changed the code, but the code is now arguably more "correct" as the -Wall check would complain about the routines declared to return int and reaching the end without a return statement.
The fact that gcc did not recognize the equivalence of the routine call and the routine call preceded by the return is a quality of implementation issue. Just as I said, there's no reason why the recursive version couldn't be compiled to yield an equivalent program. I am disappointed that more compilers don't automatically offer this obvious and simple optimization.
Updated: Some formatting | [reply] [Watch: Dir/Any] [d/l] [select] |
| [reply] [Watch: Dir/Any] |