Greetings, a_salway! This is 10 years late! Recently, chuma requested similarly the Long List is Long challenge. For testing, I modified eyepopslikeamosquito's file generator to emit 16 million unique key-value pairs, in parallel. The key-names length are 8 chars minimum ~ 19 chars maximum, sufficient for uniqueness.
use v5.30;
use MCE;
{
# Generate a random word, inspired by eyepopslikeamosquito's
# gen-llil.pl generator https://perlmonks.org/?node_id=11148681
my $ordmin = ord('A');
my $ordmax = ord('z') + 1;
sub gen_random_word {
my $nchar = shift; # the number of random chars to append
my $word = "";
for (1 .. $nchar) {
my $ord = $ordmin + int( rand($ordmax - $ordmin) );
# try again if 91 [, 92 \, 93 ], 94 ^, 95 _, or 96 `
while ($ord > 90 && $ord < 97) {
$ord = $ordmin + int( rand($ordmax - $ordmin) );
}
$word .= chr($ord);
}
return $word;
}
}
srand(42);
MCE->new(
max_workers => MCE::Util::get_ncpu(),
sequence => [ 1, 16e6 ],
bounds_only => 1,
chunk_size => 8e3,
init_relay => 1,
posix_exit => 1,
user_func => sub {
my ($mce, $chunk_ref, $chunk_id) = @_;
my $seq_beg = $chunk_ref->[0];
my $seq_end = $chunk_ref->[1];
my $out = "";
# Worker seeds generator using internal seed and wid value.
# Compute similarly using the chunk_id value.
my $seed = abs(MCE->seed - ($chunk_id * 1e5)) % 1073741780;
srand($seed);
for ($seq_beg .. $seq_end) {
$out .= gen_random_word(int(rand(12) + 8));
$out .= "\t" . int(rand(1e4) + 1) . "\n";
}
# Write output directly and orderly via relay.
MCE::relay { print $out };
}
)->run;
I tried three programs for sorting the output similarly to the Long List is Long challenge. That is count descending and key names ascending. Parsort is a program included with GNU parallel. Another variant is mcesort using mini-MCE integrated into the code.
Testing was captured on an AMD Ryzen Threadripper 3970X machine, 3600 MHz DDR4. The output file md5sum is c6da8fd098664155c190782f1dc0ca7f.
$ rm -f out; time perl demo.pl | LC_ALL=C sort -k2rn > out
real 0m34.779s
user 1m39.925s
sys 0m00.555s
$ rm -f out; time perl demo.pl | LC_ALL=C parsort -k2rn > out
real 0m04.484s
user 1m17.538s
sys 0m00.575s
$ rm -f out; time perl demo.pl | LC_ALL=C mcesort -k2rn > out
real 0m04.287s
user 1m54.859s
sys 0m01.748s
Next, a C++ demonstration running llil4map_buf2.cc. Set SSO_LIMIT to 20 for best performance, ~ line 131. Key names plus null character greater than SSO_LIMIT are dynamically allocated using blocks of memory.
$ time perl demo.pl > out.in
real 0m01.396s
user 1m16.659s
sys 0m00.287s
$ rm -f out; ./llil4map2 out.in > out
llil4map start
use OpenMP SSO_LIMIT 12 SSO_LIMIT 20
use boost sort
get properties 0.166 secs 0.159 secs
map to vector 0.055 secs 0.077 secs
vector stable sort 0.201 secs 0.113 secs
write stdout 0.097 secs 0.068 secs
total time 0.521 secs 0.419 secs
count lines 16000000
count unique 16000000
|