Thanks for your post. I agree with everything that you have written. I neglected to say that the code that I am testing has been around for a while. My first version of it took 32 hours to run. By refining the algorithms I was able to reduce that to about 12 hours. I then ran it through Devel::NYTProf and, over half a dozen or so iterations of profiling and algorithmic changes, reduced the run time to about 8 hours.
At this point, I'm being greedy and seeing what else I can get. The vast majority of the time, about 80% of the run time, is spent in the LMDB driver writing to the database. The next chunk is about 10% for Sereal to serialize the HashRef that is written to the db. After that, about 5% goes to parsing and analyzing the input data into that HashRef. The last 5% covers reading the input files and other miscellaneous work.
My belief, based upon observing perl magic at a distance, is that between 5.28, usemymalloc and O3 there is a net improvement in I/O, XS integration and compiler optimization that gets me down to 6 hours.
If I had applied only the latest perl version and the compiler optimization, I would be down to just 24 hours from 32. The vast majority of getting from 32 to 6 hours, all but 2 hours of the reduction, is due to algorithmic improvement.
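For anyone who wants to check the arithmetic, here is a quick sketch in Python. The hour figures are the ones quoted above; the variable names are just for illustration, and the 24 hour estimate rests on the assumption that the new build would scale the original 32 hour run by the same ratio it scaled the 8 hour run.

```python
# Run times in hours, as quoted in the post.
original = 32.0          # first version of the program
after_algorithms = 8.0   # after algorithm work and profiler-guided changes
after_build = 6.0        # same code on the newer, optimized perl build

# Assumption: the build change scales any run by the same ratio (6/8).
build_ratio = after_build / after_algorithms
build_only = original * build_ratio

# Split the total saving between the two kinds of change.
total_saved = original - after_build
algorithm_saved = original - after_algorithms
build_saved = after_algorithms - after_build

print(f"build alone: {original:.0f}h -> {build_only:.0f}h")
print(f"saved {total_saved:.0f}h: {algorithm_saved:.0f}h algorithms, {build_saved:.0f}h build")
```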
I am somewhat concerned about the possibility of instability that you mentioned. In my experience, I have found O2 a reliable optimization level for gcc in general. I have run into problems with O3 where it helped on some code and actually made it worse on other code. One of the things that I love about App::perlbrew is that I can easily have multiple versions of perl installed. The version that I use every day is compiled with no additional flags. I do usually have one version available compiled with O2 for those programs where through testing I know that I receive a needed boost.
Thanks again for your advice!
lbe
In reply to Re^2: Compile perl for performance
by learnedbyerror
in thread Compile perl for performance
by learnedbyerror