Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Re^4: Behold! The power of recursion.

by jordanh (Chaplain)
on Oct 18, 2004 at 11:38 UTC ( [id://400118]=note: print w/replies, xml ) Need Help??


in reply to Re^3: Behold! The power of recursion.
in thread Behold! The power of recursion.

Benchmark cannot "know" what my hypothetical "good" compiler would do. Perhaps I'm optimistic about the state of compiler technology, but I see no reason why a good compiler wouldn't create the same code for both.

In the interest of disclosure, what compiler and optimization levels were used in the C case above?

Replies are listed 'Best First'.
Re^5: Behold! The power of recursion.
by tachyon (Chancellor) on Oct 18, 2004 at 22:19 UTC

    Benchmark cannot "know" what my hypothetical "good" compiler would do.

    And? It shows what *any* non hypothetical, can compile code now, real world useful compiler can do. Out in the real world hypothetical compilers producing theoretical code are less that totally useful. The results presented were for CL.EXE /O2 on a Win2K AMD K7 system. FWIW GCC does a significantly worse job of optimising the recursive code.

    By all means compile it with anything you like on any platform and present any evidence that supports your hypothesis.

    [root@devel3 root]# ./test.pl Benchmark: timing 1000000 iterations of iterative_c, iterative_p, recu +rsive_c, recursive_p... iterative_c: 1 wallclock secs ( 0.42 usr + 0.00 sys = 0.42 CPU) @ 2 +380952.38/s (n=1000000) iterative_p: 20 wallclock secs (18.34 usr + 0.00 sys = 18.34 CPU) @ 5 +4525.63/s (n=1000000) recursive_c: 1 wallclock secs ( 0.60 usr + 0.00 sys = 0.60 CPU) @ 1 +666666.67/s (n=1000000) recursive_p: 38 wallclock secs (37.75 usr + 0.00 sys = 37.75 CPU) @ 2 +6490.07/s (n=1000000) Rate recursive_p iterative_p recursive_c iterative_c recursive_p 26490/s -- -51% -98% -99% iterative_p 54526/s 106% -- -97% -98% recursive_c 1666667/s 6192% 2957% -- -30% iterative_c 2380952/s 8888% 4267% 43% -- [root@devel3 root]# gcc -v Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)

    cheers

    tachyon

      I don't have inline installed on the Perl on the machine I'm typing this on, so I took the liberty of doing a pure C benchmark. I had to admit that I was, at first, surprised at the result, since I had heard that gcc does tail call elimination, then I found this article that explained why gcc was failing to perform this optimization. Subsequently, I changed guess_rec_c to read:
      int guess_rec_c( int ans, int lower, int higher ) { int guess; guess = (int)((lower + higher)/2); if (ans == guess) return guess; if ( guess > ans ) { return guess_rec_c( ans, lower, guess -1 ); } else { return guess_rec_c( ans, guess +1, higher ); } }
      ...and using:
      gcc -v
      Reading specs from /usr/lib/gcc-lib/i586-suse-linux/3.3.3/specs
      Configured with: ../configure --enable-threads=posix 
      --prefix=/usr --with-local-prefix=/usr/local 
      --infodir=/usr/share/info --mandir=/usr/share/man 
      --enable-languages=c,c++,f77,objc,java,ada 
      --disable-checking --libdir=/usr/lib --enable-libgcj 
      --with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib 
      --with-system-zlib --enable-shared --enable-__cxa_atexit 
      i586-suse-linux
      Thread model: posix
      gcc version 3.3.3 (SuSE Linux)
      
      with the compiler optimization -O3, the recursive version was slightly faster than the iterative version:
      jordan@linux-jordan:~> more guess.c int guess_it_c( int ans, int lower, int higher ) { int guess; for(;;) { guess = (int)((lower + higher)/2); if ( ans == guess ) break; if ( guess > ans ) higher = guess -1; else lower = guess +1; } return guess; } int guess_rec_c( int ans, int lower, int higher ) { int guess; guess = (int)((lower + higher)/2); if (ans == guess) return guess; if ( guess > ans ) { return guess_rec_c( ans, lower, guess -1 ); } else { return guess_rec_c( ans, guess +1, higher ); } } int main() { int n; for (n=0; n<1000000; n++) VERSION(4999,1,10000); } jordan@linux-jordan:~> gcc -Wall -DVERSION=guess_it_c -O3 -o guess_it +guess.c jordan@linux-jordan:~> gcc -Wall -DVERSION=guess_rec_c -O3 -o guess_re +c guess.c jordan@linux-jordan:~> time ./guess_it real 0m0.291s user 0m0.285s sys 0m0.003s jordan@linux-jordan:~> time ./guess_rec real 0m0.284s user 0m0.276s sys 0m0.004s
      I think this demonstrates that the recursive version of a program can be as efficient as an iterative solution. You might point to the fact that I changed the code, but the code is now arguably more "correct" as the -Wall check would complain about the routines declared to return int and reaching the end without a return statement. The fact that gcc did not recognize the equivalence of the routine call and the routine call preceded by the return is a quality of implementation issue. Just as I said, there's no reason why the recursive version couldn't be compiled to yield an equivalent program. I am disappointed that more compilers don't automatically offer this obvious and simple optimization.

      Updated: Some formatting

        The reason for the optimisation failure is interesting, thanks very much for the link.

        cheers

        tachyon

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://400118]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others sharing their wisdom with the Monastery: (4)
As of 2024-03-28 17:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found