Also I get the regex version about 18% faster than the reduce version.
Same here. The hash-regex variant is ~20% faster than hash-reduce. I was referring to the array-reduce variant, which is quite fast (for me at least, on two machines). To factor out the AMD machine, I tried the three MCE variants on an Intel machine. The CPU has 8 physical cores and 8 logical cores. There was a typo in the array-unpack-reduce example on PM; fixed.
Update: Added Array ForLoop results.
# C++ without fast_io
# https://www.perlmonks.org/?node_id=11152156
$ ./rtoa-pgatram-fixed t1.txt >f.tmp
read_input_files : 3999000 items
read file time : 0.213 secs
roman_to_dec time : 0.120 secs
output time : 0.809 secs
total time : 1.142 secs
# MCE Hash Reduce
# https://www.perlmonks.org/?node_id=11152073
# $output .= reduce { $a+$b-$a%$b*2 } @rtoa{ split //, uc($_) };
$ perl rtoa-mce-hash-reduce.pl t1.txt >m.tmp
rtoa pgatram start
time 1.245 secs
# MCE Hash Regex
# https://www.perlmonks.org/?node_id=11152160
# $output .= sum @r2d{ (lc $_) =~ /$re/go };
$ perl rtoa-mce-hash-regex.pl t1.txt >m.tmp
rtoa pgatram start
time 0.952 secs
# MCE Array Reduce
# https://www.perlmonks.org/?node_id=11152119
# $output .= reduce { $a+$b-$a%$b*2 } @rtoa[ unpack 'c*', uc($_) ];
$ perl rtoa-mce-array-reduce.pl t1.txt >m.tmp
rtoa pgatram start
time 0.657 secs
# MCE Array ForLoop
# https://www.perlmonks.org/?node_id=11152168
$ perl rtoa-mce-array-forloop.pl t1.txt >m.tmp
rtoa pgatram start
time 0.533 secs
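For anyone wondering why the reduce expression in the comments above works, here is a minimal single-process sketch. It is not taken from the benchmarked MCE scripts: the roman_to_dec helper and the sample inputs are made up for illustration, and only the expression itself and the %rtoa name come from the comments. It assumes well-formed Roman numerals.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Roman digit values; the hash name %rtoa follows the benchmark comments.
my %rtoa = ( I => 1, V => 5, X => 10, L => 50, C => 100, D => 500, M => 1000 );

# Hypothetical helper (not from the benchmarked scripts): convert one
# well-formed Roman numeral to its decimal value.
sub roman_to_dec {
    my ($roman) = @_;
    # $a is the running total, $b is the next digit value. Every larger
    # digit value is a multiple of every smaller one, so $a % $b is
    # non-zero only when a smaller digit precedes a larger one (IV, XC).
    # That smaller value was already added once, so it is removed twice.
    return reduce { $a + $b - $a % $b * 2 } @rtoa{ split //, uc $roman };
}

# Prints 14, 1994 and 42, one per line.
print roman_to_dec($_), "\n" for qw(XIV MCMXCIV xlii);
```

The array variants presumably apply the same arithmetic and only swap the hash slice for an array indexed by character code, which avoids hashing each character.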
I'm unable to build fast_io because the compiler on the Mac lacks -std=c++20 support. I need to update the toolchain.
Apple clang version 11.0.3 (clang-1103.0.32.62)
error: invalid value 'c++20' in '-std=c++20'
Judging from this session, Perl does not need a big box with many CPU cores to reach C++ speed.