Re: Out Of Date Optimizations? New Idioms?
by BrowserUk (Patriarch) on Jul 20, 2003 at 10:38 UTC
Unfortunately, for most algorithms, the nature of perl 5 and the way it allocates memory pretty much preclude coding algorithms so that they exploit the large L1 and L2 caches now available. If you don't control the allocation of memory, you have little chance of benefiting from processor caches, except accidentally.
I've recently been rediscovering the art (and joy) of coding in macro assembler. Full-blown Windows GUI applications in executables of less than 20k, and memory footprints smaller than typical by an even greater margin. Processing speeds that take your breath away, even on my lowly 233MHz box. The one area that requires a completely new mindset from the last time I used assembler is trying to utilise caching and pipelining to good effect. It's a whole new art that simply didn't exist the last time I played with this stuff.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
Under the heading of 'Nothing new under the sun', I'd point out that both caches and pipelines have been around a very long time - but you knew that! I have to admit that it is a hell of a lot of fun being able to play around with optimization techniques like this on hardware that I can actually afford!!
--hsm
"Never try to teach a pig to sing...it wastes your time and it annoys the pig."
--
Tommy Butler, a.k.a. Tommy
Re: Out-Of-Date Optimizations? New Idioms? RAM vs. CPU
by Abigail-II (Bishop) on Jul 21, 2003 at 07:19 UTC
I'd say that algorithms that favour memory over CPU benefit more from CPU caches than the other way around. The larger the cache, the faster the average memory access is, because it decreases the chance of a cache miss.
Abigail
If you recalculate something, so that the something doesn't stay in memory, it won't stay in the cache either. The cache is a memory cache - what's there is also in main memory.
CPUs have become faster, but main memories have become bigger. Nowadays, computers tend not to swap; if your server swaps on a regular basis, you might want to do some tuning. Memory I/O is faster than disk I/O, and the ratio of memory I/O to disk I/O is greater than the ratio of cache I/O to memory I/O.
Maybe not much of a data point, but of the servers with resource problems I've seen, more of them benefited from getting more memory than from more or faster CPUs. Most computers have more than enough CPU cycles - but usually they can use more main memory.
Abigail
A better way to improve usage of cache without going through a lot of careful tuning is to keep actively accessed data together, and avoid touching lots of memory randomly.
My understanding (from my view somewhere in the bleachers) is that Parrot's garbage collection will provide both benefits.
Incidentally, correcting a point you made in your original post: the importance of Parrot having lots of registers is not to make efficient use of cache. It is to avoid spending half of the time on stack operations (estimate quoted from my memory of elian's statement about what the JVM and .NET do). In a register-poor environment, like x86, you come out even. In a register-rich environment you win big. (Yes, I know that x86 has lots of registers - but most are not visible to the programmer, and the CPU doesn't always figure out how to use them well on the fly.)
Before someone pipes up and says that we should focus on x86: Parrot is hoping to survive well into the time when 32-bit computing is replaced by 64-bit for mass consumers. Both Intel and AMD have come out with 64-bit chips with far more registers available to the programmer than x86 has. That strongly suggests that the future of consumer computing will have lots of registers available. (Not a guarantee, though; the way I read the tea leaves is that Intel is hoping that addressing hacks like PAE will allow 32-bit computing to continue to dominate consumer desktops through the end of the decade. AMD wants us to switch earlier. I will be very interested to see which way game developers jump when their games start needing more than 2GB of RAM.)
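One way to chase the "keep actively accessed data together" idea from Perl, at least for large homogeneous datasets, is to hold records in a single packed string rather than in thousands of separate anonymous arrays. This is only a sketch (the record layout and sizes are my own invention), and whether it wins in practice depends entirely on the workload:

```perl
use strict;
use warnings;

# An array of 1000 small anonymous arrays: every element is a separate
# allocation, potentially scattered across the heap.
my @aoa = map { [ $_, $_ * 2 ] } 0 .. 999;

# The same data in one contiguous string: two 32-bit unsigned ints
# per record, laid out back to back (8 bytes per record).
my $packed = pack 'N*', map { @$_ } @aoa;

# Fetching record $i now touches one small, contiguous region.
sub record {
    my $i = shift;
    return unpack 'N2', substr($packed, $i * 8, 8);
}

my ($x, $y) = record(10);
print "$x $y\n";    # 10 20
```

The packed form also has a far smaller footprint than the array of arrays, which is itself a locality win; the price is unpack overhead on every access, so benchmark before committing to it.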
Re: Out-Of-Date Optimizations? New Idioms? RAM vs. CPU
by hawtin (Prior) on Jul 21, 2003 at 08:10 UTC
I am reminded of the original development of RISC machines. Those guys found that the only way to find the "best" approach was to take a sufficiently large example set (booting UNIX and running some programs) and run it in a well-measured simulation. When someone proposed a change to the architecture, they changed the simulation, measured the effect on performance, and kept the change if it improved the situation.
My guess would be (based, admittedly, on a total lack of first-hand knowledge) that adding metrics to Parrot (or Perl 5?) that emulated the effect of fetching information from disk/RAM/cache would lead to a reasonable simulation of Perl's performance, and hence could definitively answer these types of questions. This would allow anyone with enough interest and time on their hands to get a real answer rather than just sticking a wet finger in the air.
Re: Out-Of-Date Optimizations? New Idioms? RAM vs. CPU
by mr_mischief (Monsignor) on Jul 21, 2003 at 19:25 UTC
Keep in mind that any language (or other software system) that uses double indirection and dynamic code evaluation runs into problems optimizing for memory use and maximizing cache hits. In fact, evaluating dynamically-generated code often requires a complete flush of the cache. The best we can hope for is to minimize the ill effects of the neat tools modern languages give us. We can't ask for features that improve programmer time but run contrary to the needs of the hardware, and also expect the software to be ideally suited to the hardware's implementation.
Christopher E. Stith
Re: Out-Of-Date Optimizations? New Idioms? RAM vs. CPU
by Juerd (Abbot) on Aug 09, 2003 at 16:35 UTC
So I was wondering if there are monks out there that have found any (new) coding idioms in (today's) Perl that favor using CPU over memory, and have found them to be faster than a (old) coding idiom (favoring memory over CPU).
Quite often, a non-ST sort is faster than an ST (Schwartzian Transform). It depends on the data, the machine and the resources available on that machine. But many people (including me, I noticed) write STs without thinking or benchmarking.
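For illustration (the word list here is invented), the two idioms side by side for a case-insensitive sort. The plain sort recomputes lc() on every comparison (CPU over memory); the ST computes each key exactly once but builds a throwaway list of [key, value] pairs (memory over CPU). Benchmark on your own data before choosing:

```perl
use strict;
use warnings;

my @words = qw(banana Apple cherry date Elderberry fig);

# Plain sort: lc() is called on every comparison, roughly O(n log n) times.
my @plain = sort { lc($a) cmp lc($b) } @words;

# Schwartzian Transform: lc() is called once per element, at the cost
# of a temporary list of [key, value] array refs.
my @st = map  { $_->[1] }
         sort { $a->[0] cmp $b->[0] }
         map  { [ lc($_), $_ ] } @words;

print "@st\n";    # Apple banana cherry date Elderberry fig
```

The core Benchmark module (cmpthese/timethese) is the usual way to settle which is faster for a given key function and data size.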
Juerd
# { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
I don't know about you, but I take care to use the Guttman-Rosler Transform if I can instead of the full ST.
I never knew this technique had a different name. Although I often write a full ST, when I really need speed, this is what I already do. It's good to know that it has a name. Thanks.
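For anyone who hasn't seen it, a minimal sketch of the GRT (the input lines are invented). The trick is to pack each sort key into a fixed-width string prepended to its record, so sort can run with its default string comparison and no comparison sub is called at all; the final map strips the key back off:

```perl
use strict;
use warnings;

my @lines = ("10 ten", "2 two", "33 thirty-three", "4 four");

# Guttman-Rosler Transform: encode the numeric key as a fixed-width,
# big-endian byte string, so lexical order equals numeric order.
my @sorted = map  { substr $_, 4 }                        # strip the 4-byte key
             sort                                         # default cmp, no sub call
             map  { pack('N', (split ' ')[0]) . $_ } @lines;

print "$_\n" for @sorted;    # 2 two / 4 four / 10 ten / 33 thirty-three
```

A numeric sort { $a <=> $b } on the raw keys would be wrong here anyway for strings, but the real point is that dropping the comparison sub entirely is what makes the GRT fast; building the right fixed-width key (pack 'N' for unsigned integers, more care for floats or mixed keys) is the part that takes thought.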
Juerd
# { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }