in reply to Re^3: Weird performance issue with Strawberries and Inline::C
in thread Weird performance issue with Strawberries and Inline::C
Sorry, hopefully this will be the last instalment on that same PWC 342-2, because mission (kind of) accomplished: a symbolic and "all-important" order-of-magnitude speed gain over "plain" C on a single core. It came from a little re-arrangement, partial loop unrolling (stepping in 96 bytes instead of 32: sums of "-1"s and "1"s won't overflow and can be kept as bytes a little longer) and a different choice for some instructions (TIMTOWTDI, there's a real zoo of them). Apologies (can't edit parent), also, for describing AVX2 as "2017+" (of course it's older) and for using "len" in place of "aligned_len" above, once: it affects neither speed nor result, but is just unclean.
| String length | Variant | Rate/s | % |
|---|---|---|---|
| 10,000 | c | 88080 | 100 |
| 10,000 | va_single | 811975 | 922 |
| 100,000 | c | 8945 | 100 |
| 100,000 | va_single | 108085 | 1208 |
| 1,000,000 | c | 894 | 100 |
| 1,000,000 | va_single | 10802 | 1208 |
| 10,000,000 | c | 88.4 | 100 |
| 10,000,000 | va_single | 913.5 | 1033 |
| 100,000,000 | c | 8.8 | 100 |
| 100,000,000 | va_single | 83.4 | 946 |
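For readers who want to see why a 96-byte step lets the sums stay in bytes, here is a minimal scalar sketch in portable C. It is not the AVX2 code from this thread. It assumes the task is the "max score" split (zeros on the left plus ones on the right, both parts non-empty) and that the input contains only '0' and '1'. The names `mscore_ref` and `mscore_blocked` are hypothetical. The idea shown: within a block of 96 characters a running sum of +1/-1 stays within ±96, so it fits a signed byte and only needs widening at block boundaries.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference (assumed task definition): best split score,
   i.e. zeros in the left part plus ones in the right part. */
long mscore_ref(const char *s, size_t n) {
    if (n < 2) return 0;                 /* no valid split: assumption */
    long ones = 0;
    for (size_t i = 0; i < n; i++) ones += (s[i] == '1');
    long bal = 0, best = LONG_MIN;       /* bal = zeros - ones so far */
    for (size_t i = 0; i + 1 < n; i++) {
        bal += (s[i] == '0') ? 1 : -1;
        if (bal > best) best = bal;
    }
    return ones + best;
}

#define BLOCK 96  /* 96 <= 127, so a block-local sum fits in int8_t */

/* Same result, but the running sum and its maximum are kept as
   signed bytes inside each block and widened once per block. */
long mscore_blocked(const char *s, size_t n) {
    if (n < 2) return 0;
    size_t last = n - 1;                 /* split points cover s[0..n-2] */
    long base = 0, best = LONG_MIN;
    for (size_t start = 0; start < last; start += BLOCK) {
        size_t end = (last - start > BLOCK) ? start + BLOCK : last;
        int8_t acc = 0, top = INT8_MIN;  /* byte-sized, block-local */
        for (size_t i = start; i < end; i++) {
            acc = (int8_t)(acc + ((s[i] == '0') ? 1 : -1));
            if (acc > top) top = acc;
        }
        if (base + top > best) best = base + top;
        base += acc;                     /* widen at the block boundary */
    }
    /* Only '0'/'1' assumed: zeros - ones = base, zeros + ones = last. */
    long ones = ((long)last - base) / 2 + (s[n - 1] == '1');
    return ones + best;
}
```

With a block of 128 the byte could overflow on a run of identical characters, which is presumably why the step stops at three 32-byte vectors; the scalar loop above only demonstrates the range argument, not the speed.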
However, with OMP (same CPU, 4 cores) there is no gain compared to the "unoptimised" version in the parent node:
| String length | Variant | Rate/s | % |
|---|---|---|---|
| 100,000,000 | c | 8.8 | 100 |
| 100,000,000 | va_omp | 168.7 | 1910 |
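As a rough sketch of how such chunked processing can be spread over cores (an assumption about the general shape, not the va_omp code itself): each chunk reports its total +1/-1 sum and its best running sum, and a short sequential pass stitches the chunks together. The function name `mscore_chunked`, the `chunk_t` struct and the cap of 64 chunks are all hypothetical, and the input is again assumed to contain only '0' and '1'. Without `-fopenmp` the pragma is ignored and the code runs serially with the same result.

```c
#include <limits.h>
#include <stddef.h>

typedef struct { long sum; long top; } chunk_t;  /* hypothetical helper type */

long mscore_chunked(const char *s, size_t n, int nchunks) {
    if (n < 2 || nchunks < 1) return 0;  /* no valid split: assumption */
    if (nchunks > 64) nchunks = 64;
    chunk_t part[64];
    size_t last = n - 1;                 /* split points cover s[0..n-2] */

    /* Each chunk is independent: total sum and best prefix sum. */
    #pragma omp parallel for
    for (int c = 0; c < nchunks; c++) {
        size_t lo = last * (size_t)c / (size_t)nchunks;
        size_t hi = last * (size_t)(c + 1) / (size_t)nchunks;
        long sum = 0, top = LONG_MIN;
        for (size_t i = lo; i < hi; i++) {
            sum += (s[i] == '0') ? 1 : -1;
            if (sum > top) top = sum;
        }
        part[c].sum = sum;
        part[c].top = top;               /* LONG_MIN marks an empty chunk */
    }

    /* Sequential stitch: offset each chunk's best by the sums before it. */
    long base = 0, best = LONG_MIN;
    for (int c = 0; c < nchunks; c++) {
        if (part[c].top != LONG_MIN && base + part[c].top > best)
            best = base + part[c].top;
        base += part[c].sum;
    }
    /* Only '0'/'1' assumed: zeros - ones = base, zeros + ones = last. */
    long ones = ((long)last - base) / 2 + (s[n - 1] == '1');
    return ones + best;
}
```

The per-chunk work is a single streaming pass over memory, so a loop shaped like this can end up limited by memory bandwidth rather than by core count; whether that applies to the numbers above is not something this sketch can show.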
That, and the relative decrease in advantage for very long input (no additional memory allocation occurs, only the same simple processing of string chunks, sequentially), I can only speculate is the result of throttling of some kind.
And by the way, I also tried placing the "mscore_c" C function in a separate file and compiling it with "-O3 -march=native" (can't do it with Inline::C, can I?), which, as googling suggests, would optimise/vectorise to the best of the compiler's "voodoo"; I then linked the library and called the wrapped function from Inline::C. Well, for a uniform '"1"s only' string it gave a ~25% gain, but for a "random" string it is 3 times slower (than "-O2", i.e. compiled directly from within Inline::C). This is ridiculous: "voodoo", "A.I.", "free" optimisations, and what not.
Re^5: Weird performance issue with Strawberries and Inline::C
by Anonymous Monk on Nov 12, 2025 at 13:53 UTC
Re^5: Weird performance issue with Strawberries and Inline::C
by Anonymous Monk on Nov 17, 2025 at 14:54 UTC