Your experience of 3:1 to 10:1 feels about right to me.
Regarding assembly language: I'll have to agree ... mostly.
For embedded systems with small processors, you can still beat a compiler once you know enough about the chip and the application. On larger computers with modern CPUs, when starting from scratch, you're dead on--it's hard to beat a good compiler. However, when I dive into assembly language on a modern CPU, I don't start from scratch; I let the compiler generate the first pass for me. Then I dig into the processor manuals, examine the algorithm, and improve the code from there. (And only if I can find a bottleneck I can reasonably expect to improve.)
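For illustration, here's roughly how I get that first pass (a minimal sketch assuming gcc; the function name and file name are made up for the example):

```c
/* sum.c -- a toy function to optimize (names are hypothetical).
 *
 * Get the compiler's "first pass" with:
 *
 *     gcc -O2 -S -fverbose-asm sum.c
 *
 * That emits sum.s, annotated assembly you can read against the
 * processor manuals and hand-tune where the profiler says it matters.
 */
unsigned int sum_bytes(const unsigned char *buf, unsigned int len)
{
    unsigned int total = 0;
    for (unsigned int i = 0; i < len; i++)
        total += buf[i];
    return total;
}
```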
When I started programming in assembler (Z-80, 6502, 68000, 8051 days), it was easy to beat a compiler because compilers generally weren't that good, and the CPU timings were easy to understand. You could read the instruction timings and have a good shot at improving the speed on your first attempt. This is still true on most small embedded systems (AVR, PIC, etc.).
Once caching became popular, things started to get "interesting". You had to understand how your code, the cache, and the access history interacted, and the timings became trickier because code and data access speeds varied. You could make a decent guess about how to improve the algorithm, but you could easily be surprised when things didn't work the way you expected; you needed more insight into your algorithm to make significant speed improvements. So larger embedded systems using these processors (80386, ARM, etc.) are harder to improve, and the compilers for them tend to be smarter.
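The cache effect is easy to demonstrate (this is my own sketch, assuming a conventional cache hierarchy and 4-byte ints; a hardware prefetcher may narrow the gap): both sweeps below read every element of the array exactly once, but the strided order defeats the cache, so it runs noticeably slower.

```c
#include <stdio.h>
#include <time.h>

#define N (1 << 24)              /* 64 MB of ints -- far larger than cache */

static int a[N];

/* Read every element exactly once, in passes of the given stride. */
static long sweep(int stride)
{
    long sum = 0;
    for (int start = 0; start < stride; start++)
        for (int i = start; i < N; i += stride)
            sum += a[i];
    return sum;
}

int main(void)
{
    for (int i = 0; i < N; i++)  /* touch the pages up front */
        a[i] = i & 0xff;

    clock_t t0 = clock();
    long s1 = sweep(1);          /* sequential: cache-friendly */
    clock_t t1 = clock();
    long s2 = sweep(16);         /* 64-byte jumps: ~one cache line per read */
    clock_t t2 = clock();

    printf("sequential %.2fs   strided %.2fs   (sums %ld %ld)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC, s1, s2);
    return 0;
}
```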
When the Pentium came out with the U and V pipes, things got downright hard. There was enough complexity in the timings, the multiple cache levels, and figuring out how to keep both the U and V pipelines busy that improving execution speed involved a lot of guesswork. At that point you had to really chew on the problem, and you had to measure things frequently. You could no longer rely on the instruction timings alone unless you really had a feel for how everything interacted, and even then it was finicky. Of course, these processors are so fast that dropping down into assembler is much less common.
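Here's the flavor of what pairing for the two pipes looks like from C (again my own sketch, not anything from the thread; compile with optimization but without auto-vectorization, e.g. gcc -O2 -fno-tree-vectorize, or the compiler will transform both loops itself): splitting one serial dependency chain into two independent accumulators lets a superscalar core execute the adds in parallel.

```c
#include <stdio.h>
#include <time.h>

#define N    10000000
#define REPS 50

static long a[N];

/* One accumulator: each add must wait for the previous one. */
static long sum1(const long *v, int n)
{
    long s = 0;
    for (int i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Two accumulators: the adds are independent, so they can issue
 * down separate pipes (U and V on the Pentium) in the same cycle. */
static long sum2(const long *v, int n)
{
    long s0 = 0, s1 = 0;
    for (int i = 0; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (n & 1)
        s0 += v[n - 1];
    return s0 + s1;
}

int main(void)
{
    for (int i = 0; i < N; i++)
        a[i] = i & 7;

    clock_t t0 = clock();
    long r1 = 0;
    for (int r = 0; r < REPS; r++)
        r1 += sum1(a, N);
    clock_t t1 = clock();
    long r2 = 0;
    for (int r = 0; r < REPS; r++)
        r2 += sum2(a, N);
    clock_t t2 = clock();

    printf("one chain %.2fs   two chains %.2fs   (%ld %ld)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC, r1, r2);
    return 0;
}
```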
Now, with speculative execution, branch prediction, etc., I'm not sure I could beat a compiler. And even if I did, it would be the result of many guesses and experiments. I've not had a need to improve the code for any CPU more advanced than a Pentium II, so I don't know how much more difficult it is to optimize code in that environment. I guess I'll have to find some time to play around with my Atom, Athlon XP and Pentium 4 computers and see what I can do...
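Branch prediction, at least, is easy to demonstrate from C (my sketch; note that at higher optimization levels the compiler may emit a branchless conditional move and flatten the difference): the same loop over the same data runs much faster once the data is sorted, because the predictor can learn the branch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N    1000000
#define REPS 100

static int cmp(const void *p, const void *q)
{
    return *(const int *)p - *(const int *)q;
}

static long tally(const int *a)
{
    long sum = 0;
    for (int r = 0; r < REPS; r++)
        for (int i = 0; i < N; i++)
            if (a[i] >= 128)        /* the branch under test */
                sum += a[i];
    return sum;
}

int main(void)
{
    static int a[N];
    for (int i = 0; i < N; i++)
        a[i] = rand() % 256;

    /* Unsorted: the branch is taken ~50% of the time at random,
     * so the predictor mispredicts constantly. */
    clock_t t0 = clock();
    long s1 = tally(a);
    clock_t t1 = clock();

    qsort(a, N, sizeof a[0], cmp);

    /* Sorted: false for the first half, true for the second;
     * the predictor gets it right almost every time. */
    clock_t t2 = clock();
    long s2 = tally(a);
    clock_t t3 = clock();

    printf("unsorted %.2fs   sorted %.2fs   (%ld %ld)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t3 - t2) / CLOCKS_PER_SEC, s1, s2);
    return 0;
}
```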
And I've totally ignored the introduction of multitasking, too. Once that came in, you had no idea how other applications were going to impact yours. Compilers, too, became a lot better. Now, on the (very rare) occasions when I drop into assembly, it's either an embedded system that's easy to understand and where I *truly* need the speed, or it's just (a) pure fun, (b) a challenge for myself, and/or (c) an exercise to keep myself sharp.
...roboticus