> found bug in benchmark, will correct later. vec still among slowest

as promised:

The integrity of the results is tested in a way that excludes possible side effects of the benchmark routines.

Benchmarks are run on datasets of decreasing density: the array sizes stay fixed while the value range grows (the "vec" approach breaks above 1e10).

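Result: "vec" doesn't scale well for larger values. It ties with the fastest approach at ranges of 1e4 and 1e6, but drops to 17.1/s at 1e8 and 3.29/s at 1e9, while the hash-based approaches stay roughly constant.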
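A likely reason for the slowdown, sketched here as an assumption rather than something measured in the benchmark: the bit string built by vec is sized by the largest offset written, not by the number of elements. The helper name vec_bytes_for is hypothetical and not part of the benchmark code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the benchmark): returns the length in
# bytes of a bit vector after setting only the single bit at offset $max.
sub vec_bytes_for {
    my ($max) = @_;
    my $vec = '';
    vec( $vec, $max, 1 ) = 1;    # Perl pads the string with zero bytes up to this bit
    return length $vec;
}

# One set bit is enough to force the full allocation.
printf "max offset %9d => %8d bytes\n", $_, vec_bytes_for($_)
    for 1e4, 1e6, 1e8;
# 1e4 => 1251 bytes, 1e6 => 125001 bytes, 1e8 => 12500001 bytes
```

By the same arithmetic a range of 1e9 needs roughly 125 MB per call, regardless of how few values are actually stored, whereas the hash-based variants only pay for the 6000 to 8000 keys they hold.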

    Perlversion v5.10.0


     Setting values range: 1..1e4

    ok 1 - hash_grep ~~ hash_key (3231 entries)
    ok 2 - hash_key ~~ hash_values (3231 entries)
    ok 3 - hash_values ~~ using_vec (3231 entries)
    ok 4 - using_vec ~~ hash_grep (3231 entries)
    ok 5 - hash_grep ~~ hash_key (3231 entries)
    ok 6 - hash_key ~~ hash_values (3231 entries)
    ok 7 - hash_values ~~ using_vec (3231 entries)
    1..7


     Setting values range: 1..1e4

                  Rate hash_values hash_key using_vec hash_grep
    hash_values 29.8/s          --     -45%      -48%      -48%
    hash_key    53.8/s         81%       --       -7%       -7%
    using_vec   57.7/s         93%       7%        --       -0%
    hash_grep   57.7/s         93%       7%        0%        --


     Setting values range: 1..1e6

                  Rate hash_values hash_key using_vec hash_grep
    hash_values 28.5/s          --     -24%      -24%      -25%
    hash_key    37.3/s         31%       --       -0%       -2%
    using_vec   37.3/s         31%       0%        --       -2%
    hash_grep   38.2/s         34%       2%        2%        --


     Setting values range: 1..1e8

                  Rate using_vec hash_values hash_grep hash_key
    using_vec   17.1/s        --        -41%      -53%     -57%
    hash_values 29.1/s       70%          --      -20%     -26%
    hash_grep   36.2/s      112%         24%        --      -8%
    hash_key    39.5/s      131%         36%        9%       --


     Setting values range: 1..1e9

                  Rate using_vec hash_values hash_grep hash_key
    using_vec   3.29/s        --        -88%      -91%     -92%
    hash_values 28.2/s      759%          --      -21%     -27%
    hash_grep   36.0/s      993%         27%        --      -7%
    hash_key    38.8/s     1081%         38%        8%       --

    Compilation finished at Wed Nov 27 02:08:15

code:
    use strict;
    use warnings;
    use Benchmark qw/cmpthese/;
    use Data::Dump qw/pp/;
    use feature 'say';
    use Test::More;

    say "Perlversion $^V";

    my (@arr_1,@arr_2);

    # Fill @arr_1 (8000) and @arr_2 (6000) with unique random integers below $range_limit
    sub init_data{
        my $range_limit=shift;
        say "\n\n Setting values range: 1..$range_limit\n";
        my %unique;
        $unique{int rand $range_limit}=undef while keys %unique <8000;
        @arr_1=keys %unique;
        %unique=();
        $unique{int rand $range_limit}=undef while keys %unique <6000;
        @arr_2=keys %unique;
    }

    {
        package CMP;

        my @res= ("") x 8000;

        sub hash_values_diff {
            my %diff3;
            @diff3{@arr_1} = @arr_1;
            delete @diff3{@arr_2};
            @res = values %diff3 ;
        }

        sub hash_key_diff {
            my %diff3;
            undef @diff3{@arr_1};
            delete @diff3{@arr_2};
            @res = keys %diff3 ;
        }

        sub using_vec_diff {
            my $vec = '';
            vec( $vec, $_, 1 ) = 1 for @arr_2;
            @res = grep !vec( $vec, $_, 1 ), @arr_1;
        }

        sub hash_grep_diff {
            my %arr_2_hash;
            undef @arr_2_hash{@arr_2};
            @res = grep !exists $arr_2_hash{$_}, @arr_1;
        }
    }

    #--- Test benchmarks for correctness
    init_data("1e4");
    test_subs('_diff$',"CMP");

    #--- Compare benchmarks
    foreach my $range (qw/1e4 1e6 1e8 1e9/) {
        init_data($range);
        cmpthese(-5, pckg_subs('_diff$',"CMP") );
    }

    #====== Utility functions for smarter benchmarks

    # Return a hashref { short_name => coderef } of all subs in a package
    # whose names match $filter (the matched part is stripped from the name)
    sub pckg_subs {
        my $filter    = shift // qr[.];
        my $pckg_name = shift // "CMP";

        my $stash = do {
            no strict 'refs';
            \ %{ "${pckg_name}::" };
        };

        # filter all subs from package
        my $codehashref;
        while (my ($name,$glob)= each %$stash) {
            if ( my $cref = *{$glob}{CODE} and $name =~ s/$filter// ) {
                # print "$name:\t$glob\n";
                $codehashref->{$name}=$cref;
            }
        }
        return $codehashref;
    }

    sub test_subs {
        my $h_subs=pckg_subs(@_);
        my ($last_name,@last_res,$name,$code,@res);

        # compare pairwise, loop twice to exclude side effects
        for (1..2) {
            while (($name,$code) =each %$h_subs ) {
                @res = sort $code->();
                is_deeply(\@last_res,\@res,"$last_name ~~ $name (". scalar(@res) . " entries)")
                    if $last_name;
                @last_res  = @res;
                $last_name = $name;
            }
        }
        done_testing( );
    }
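To make it easy to check by hand what the benchmarked subs compute (the elements of the first array that are absent from the second), here is a minimal sketch on a tiny dataset. The sub names diff_hash_key and diff_hash_grep and the sample arrays are made up for illustration. Unlike the benchmark subs, these take array references and sort their output, so they are not drop-in replacements.

```perl
use strict;
use warnings;

# Illustrative variant of the hash_key approach: load the keys of the
# first list, delete those of the second, return what is left.
sub diff_hash_key {
    my ( $first, $second ) = @_;
    my %seen;
    @seen{@$first} = ();        # keys only, values stay undef
    delete @seen{@$second};
    return sort { $a <=> $b } keys %seen;
}

# Illustrative variant of the hash_grep approach: build a lookup of the
# second list and filter the first list against it.
sub diff_hash_grep {
    my ( $first, $second ) = @_;
    my %in_second;
    @in_second{@$second} = ();
    return sort { $a <=> $b } grep { !exists $in_second{$_} } @$first;
}

my @small_1 = ( 1, 2, 3, 4, 5, 6 );
my @small_2 = ( 2, 4, 6, 8 );

print join( ' ', diff_hash_key( \@small_1, \@small_2 ) ),  "\n";   # 1 3 5
print join( ' ', diff_hash_grep( \@small_1, \@small_2 ) ), "\n";   # 1 3 5
```

Both variants only ever touch as many hash entries as there are elements, which fits the roughly constant rates of the hash-based subs across all value ranges in the tables above.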

Cheers Rolf

(addicted to the Perl Programming Language)


In reply to Re^5: Best method to diff very large array efficiently by LanX
in thread Best method to diff very large array efficiently by newbieperlperson
