in reply to Re: Writing hashes as records to a CSV file
in thread Writing hashes as records to a CSV file

> after %H = (%H,%$_) for @AoH you'll have the superset in keys %H

Actually, using a hash slice with an empty value list is much faster:

@H{keys %$_} = () for @$AoH;
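A minimal sketch of what that one-liner does, using a made-up three-record `$AoH` (the sample data and the `@header` name are illustrative assumptions, not from the thread). Assigning the empty list to a hash slice creates every listed key with an `undef` value, so only the keys themselves are touched:

```perl
use strict;
use warnings;

# Hypothetical sample data: three records with overlapping keys.
my $AoH = [
    { id   => 1,     name   => 'foo' },
    { id   => 2,     colour => 'red' },
    { name => 'bar', size   => 42 },
];

my %H;

# Assigning () to a hash slice creates each listed key with an undef
# value; a key that already exists is simply set to undef again.
@H{ keys %$_ } = () for @$AoH;

# The superset of keys, e.g. usable as a CSV header line.
my @header = sort keys %H;
print join(',', @header), "\n";    # prints: colour,id,name,size
```

The values in `%H` are never used here; the hash only serves as a set of key names.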

DEMO:

use v5.12;
use warnings;
use Test::More;
use Benchmark qw/cmpthese/;

my $AoH;

for my $n_rec (1, 10, 100, 1000) {
    say;
    say "=== num of records is: ", $n_rec;
    $AoH = create_data(1, $n_rec);

    is_deeply(
        [sort &list_join],
        [sort &slice_join],
    );

    cmpthese(-1, {
        'list_join'  => \&list_join,
        'slice_join' => \&slice_join,
    });
}

done_testing;

sub list_join {
    my %H;
    %H = (%H, %$_) for @$AoH;
    return keys %H;
}

sub slice_join {
    my %H;
    @H{keys %$_} = () for @$AoH;
    return keys %H;
}

sub create_data {
    my ($density, $records) = @_;
    my @AoH;
    push @AoH, { map { rand 100 <= $density ? ("$_" => $_) : () } "A" .. "ZZ" }
        for 1 .. $records;
    return \@AoH;
}

OUTPUT:
__DATA__
=== num of records is: 1
ok 1
               Rate  list_join slice_join
list_join  238532/s         --       -65%
slice_join 682713/s       186%         --
__DATA__
=== num of records is: 10
ok 2
               Rate  list_join slice_join
list_join    7819/s         --       -93%
slice_join 112993/s      1345%         --
__DATA__
=== num of records is: 100
ok 3
             Rate  list_join slice_join
list_join  82.9/s         --       -99%
slice_join 8533/s     10195%         --
__DATA__
=== num of records is: 1000
ok 4
             Rate  list_join slice_join
list_join  3.66/s         --      -100%
slice_join 1067/s     29072%         --
1..4

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Update

fixed bug in sorted tests

Re^3: Writing hashes as records to a CSV file (joining keys with slices)
by Tux (Canon) on Dec 09, 2021 at 14:06 UTC

    I compared it to the two methods I used in my examples, just out of curiosity and to learn. I never used list_join, as it intuitively feels stupid/slow, but slice_join was not in my default toolbox either, as map feels so much more logical to me. Anyway, here we go ...

    I think the grep_map is a neat contender and reads easier than the slice_join. YMMV.


    Enjoy, Have FUN! H.Merijn
      And for the spirit of TIMTOWTDI:
      sub map_direct {
          my %H = (map %$_, @$AoH);
          return keys %H;
      }

      sub map_keys {
          my %H;
          @H{map keys %$_, @$AoH} = ();
          return keys %H;
      }

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      > I never used the list_join as that intuitively feels stupid/slow,

      Well, it's the canonical way to join two hashes.
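For reference, a small sketch of that list-style merge on two hashes (the names `%defaults`, `%options` and `%merged` are made up for illustration). Both hashes are flattened into one key/value list, which also suggests why repeating `%H = (%H, %$_)` in a loop gets slow: the accumulated hash is copied again on every pass.

```perl
use strict;
use warnings;

# Hypothetical example hashes.
my %defaults = (sep => ',', quote => '"');
my %options  = (sep => ';');

# Both hashes are flattened into a single key/value list; for a
# duplicate key, the pair that comes last in the list wins.
my %merged = (%defaults, %options);

print "$_=$merged{$_}\n" for sort keys %merged;
```

Here `sep` ends up as `;` because `%options` comes last in the list.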

      > I think the grep_map is a neat contender and reads easier than the slice_join. YMMV.

      I'm very skeptical about solutions that build long intermediate lists, as the map variants do. They might cause memory problems.

      See what happens with a density of 75%, and now imagine handling much more data, where the machine starts swapping.
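A rough sketch of the size difference, assuming fully dense, deterministic data (1000 records that each carry all keys "A" .. "ZZ") instead of the random `create_data` from the benchmark. The element count is only a proxy for memory use, not a measurement of it:

```perl
use strict;
use warnings;

# Hypothetical data: 1000 records, each with the same keys "A" .. "ZZ".
my @keys = ('A' .. 'ZZ');
my $AoH  = [];
for (1 .. 1000) {
    my %rec;
    @rec{@keys} = (1) x @keys;
    push @$AoH, \%rec;
}

# map_keys style: one flat list holding every key of every record.
my $flat = () = map { keys %$_ } @$AoH;

# slice_join style: only one record's keys are in flight at a time.
my $max = 0;
for my $rec (@$AoH) {
    my $n = keys %$rec;
    $max = $n if $n > $max;
}

print "flat intermediate list: $flat elements\n";
print "largest per-record list: $max elements\n";
```

With this data the flat list grows with the number of records, while the per-record list stays bounded by the number of distinct keys in a single record.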

      Cheers Rolf

      update

      fixed bugs in tests with sort, see also sort AoH buggy?