Re^2: Writing hashes as records to a CSV file (joining keys with slices)

> after %H = (%H,%$_) for @AoH you'll have the superset in keys %H

Actually, using a hash slice and assigning it an empty list is way faster:

@H{keys %$_} = () for @$AoH;
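A minimal sketch of what that one-liner does, using two made-up records (the data and the variable names are illustrative assumptions, not taken from the thread):

```perl
use strict;
use warnings;

# Two hypothetical records with partly overlapping keys.
my $AoH = [
    { name => 'alice', age  => 30 },
    { name => 'bob',   city => 'Oslo' },
];

my %H;
# Assigning the empty list to a hash slice creates every key named in
# the slice with an undef value; keys that already exist are reused.
@H{ keys %$_ } = () for @$AoH;

my @columns = sort keys %H;
print "@columns\n";    # age city name
```

Only the keys matter here; the values in %H stay undef, which is why nothing from the records needs to be copied.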

DEMO:

use v5.12;
use warnings;
use Test::More;
use Benchmark qw/cmpthese/;

my $AoH;

for my $n_rec (1, 10,100,1000) {
    say "";
    say "=== num of records is: ",$n_rec;
    $AoH = create_data(1,$n_rec);
    is_deeply(
              [sort &list_join],
              [sort &slice_join],
             );

    cmpthese(-1, {
                  'list_join'  => \&list_join,
                  'slice_join' => \&slice_join,
                 }
            );

}

done_testing;

sub list_join {
    my %H;
    %H = (%H,%$_) for @$AoH;
    return keys %H;
}

sub slice_join {
    my %H;
    @H{keys %$_}=() for @$AoH;
    return keys %H;
}

sub create_data {
    my ( $density,$records ) = @_ ;

    my @AoH;
    push @AoH, { map { rand 100 <= $density ? ("$_" => $_) : () } "A".."ZZ" }
        for 1..$records;
    return \@AoH;
}

OUTPUT:

__DATA__
=== num of records is: 1
ok 1
               Rate  list_join slice_join
list_join  238532/s         --       -65%
slice_join 682713/s       186%         --
__DATA__
=== num of records is: 10
ok 2
               Rate  list_join slice_join
list_join    7819/s         --       -93%
slice_join 112993/s      1345%         --
__DATA__
=== num of records is: 100
ok 3
               Rate  list_join slice_join
list_join    82.9/s         --       -99%
slice_join   8533/s     10195%         --
__DATA__
=== num of records is: 1000
ok 4
               Rate  list_join slice_join
list_join    3.66/s         --      -100%
slice_join   1067/s     29072%         --
1..4
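To tie this back to the thread's topic, here is a rough sketch of how the joined keys could serve as the CSV header, with a hash slice per record filling the rows. The records are made up, and the plain join is an assumption that only holds for values without commas, quotes or newlines; real data should go through a proper CSV module such as Text::CSV_XS.

```perl
use strict;
use warnings;

# Hypothetical records; values are simple enough to need no quoting.
my $AoH = [
    { a => 1, b => 2 },
    { b => 3, c => 4 },
];

# Superset of all keys, collected with the slice idiom.
my %H;
@H{ keys %$_ } = () for @$AoH;
my @cols = sort keys %H;

my @lines = join ',', @cols;    # header row
for my $rec (@$AoH) {
    # The hash slice returns the values in column order; a key that is
    # missing from this record yields undef, written as an empty field.
    push @lines, join ',', map { $_ // '' } @$rec{@cols};
}
print "$_\n" for @lines;
```

Sorting the keys is just one way to get a stable column order; any fixed ordering of keys %H would do.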

Cheers Rolf
(addicted to the Perl Programming Language :)

Update

fixed bug in sorted tests


Re^3: Writing hashes as records to a CSV file (joining keys with slices) by Tux (Canon) on Dec 09, 2021 at 14:06 UTC
I compared it to the two methods I used in my examples, just out of curiosity and to learn. I never used the list_join as that intuitively feels stupid/slow, but the slice_join was also not in my default tool-box, as `map` feels so much more logical to me. Anyway, here we go ...

I think the grep_map is a neat contender and reads easier than the slice_join. YMMV.

Enjoy, Have FUN! H.Merijn
Re^4: Writing hashes as records to a CSV file (joining keys with slices) by choroba (Cardinal) on Dec 09, 2021 at 21:41 UTC
And for the spirit of TIMTOWTDI:

sub map_direct { my %H = (map %$_, @$AoH); return keys %H }
sub map_keys { my %H; @H{map keys %$_, @$AoH} = (); return keys %H }

`map{substr$_->[0],$_->[1]||0,1}[\||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^ARGV,3]`
Re^5: Writing hashes as records to a CSV file (joining keys with slices) by Tux (Canon) on Dec 10, 2021 at 08:03 UTC
And the code and results (on my box)

Enjoy, Have FUN! H.Merijn
Re^4: Writing hashes as records to a CSV file (joining keys with slices) by LanX (Saint) on Dec 09, 2021 at 14:45 UTC
> I never used the list_join as that intuitively feels stupid/slow,

Well, it's the canonical way to join two hashes.

> I think the grep_map is a neat contender and reads easier than the slice_join. YMMV.

I'm very skeptical about solutions with long intermediate lists like with the map. They might cause memory problems.

See what happens with a density of 75%, and now imagine handling much more data, where the machine starts swapping.

Cheers Rolf
(addicted to the Perl Programming Language :)

Update: fixed bugs in tests with sort, see also "sort AoH buggy?"
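The concern about long intermediate lists can be made a bit more concrete. The sketch below uses made-up data (1000 records sharing the same 50 keys, an assumption chosen only for illustration) and simply counts list elements; it does not measure actual memory use.

```perl
use strict;
use warnings;

# Hypothetical data: 1000 records that all carry the same 50 keys.
my @keys = map { "k$_" } 1 .. 50;
my @AoH;
push @AoH, { map { ( $_ => 1 ) } @keys } for 1 .. 1000;

# map-based join: the flattened key list of ALL records exists at once
# before the slice assignment happens.
my @flat = map { keys %$_ } @AoH;
my %from_map;
@from_map{@flat} = ();

# slice-based join: only one record's keys are in flight per iteration.
my %from_slice;
@from_slice{ keys %$_ } = () for @AoH;

printf "intermediate list: %d elements, unique keys: %d\n",
    scalar @flat, scalar keys %from_slice;
```

Both approaches end up with the same 50 keys, but the map variant builds a 50000-element list on the way there, and that list grows with the number of records rather than with the number of distinct keys.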
