Re: Faster indexing an array

Replies are listed 'Best First'.
Re^2: Faster indexing an array by wollmers (Scribe) on Sep 19, 2014 at 22:09 UTC
Thanks a lot. Davidos solution is faster than mine, and LanX' and CountZeros is the fastest. For some reason Devel::NYTProf gives better results for C-style loops over 'for my $i (...)'. The hashref is a relict from having this part in an sub, then refactored down, then inlined, now 1-lined. See here the same logic in Algorithm::Diff: `sub _withPositionsOfInInterval { my $aCollection = shift; # array ref my $start = shift; my $end = shift; my $keyGen = shift; my %d; my $index; for ( $index = $start ; $index <= $end ; $index++ ) { my $element = $aCollection->[$index]; my $key = &$keyGen( $element, @_ ); if ( exists( $d{$key} ) ) { unshift ( @{ $d{$key} }, $index ); } else { $d{$key} = [$index]; } } return wantarray ? %d : \%d; }` [download] Having an arrayref comes from the two input-parameters, sequences X, Y or sometimes also called A, B in the descriptions of diff/LCS/align-algorithms. Helmut Wollmersdorfer	[reply] [d/l]
Re^3: Faster indexing an array by LanX (Saint) on Sep 19, 2014 at 22:34 UTC
The `if(exists...)` test isn't necessary and can be optimized away! `DB<119> unshift @{ $d{key} }, 'bla' => 1 DB<120> \%d => { key => ["bla"] }` [download] Cheers Rolf _{(addicted to the Perl Programming Language and ☆☆☆☆ :)}	[reply] [d/l] [select]
Re^3: Faster indexing an array by LanX (Saint) on Sep 19, 2014 at 23:17 UTC
you didn't tell us that you need only an interval out of the array, the fastest approach with my version of Perl is iterating over an interval or array slice `DB<173> use Time::HiRes qw/time/ DB<174> @x=();$t=time; for (my $i=$start; $i<=$end;$i++) { push @x, +$a[$i]}; print time-$t 0.21531081199646 DB<175> @y=(); $t=time; push @y, $_ for @a[$start..$end]; print time +-$t 0.1353440284729 DB<176> @z=();$t=time; push @z, $a[$_] for $start..$end; print time- +$t 0.142512083053589` [download] (you are welcome to do proper benchmarking =) but optimizing or inlining `&$keyGen( $element, @_ )` should lead to much more efficiency, since 25% of 15% doesn't count much!!! update of course you could directly iterate over the values of a hash slice of an array slice... :) `DB<180> %h=(); $i=$start; $t=time; push @$_, $i++ for @h{@a[$start.. +$end]}; print time-$t => 1 0.427901983261108` [download] BTW: `@a=0..1e6;$start=1e5;$end=2*$start;` Cheers Rolf _{(addicted to the Perl Programming Language and ☆☆☆☆ :)}	[reply] [d/l] [select]
Re^4: Faster indexing an array by wollmers (Scribe) on Sep 20, 2014 at 06:54 UTC
It's not my code, it's from Algorithm::Diff. Of course, A::Diff could be faster, maybe I take over the module.	[reply]
Re^2: Faster indexing an array by SuicideJunkie (Vicar) on Sep 19, 2014 at 21:33 UTC
I for one, tend to use scalars and refs because that makes using the variables more consistent (arrows for everybody!). It makes for less effort to pass around the refs to helper functions. I'll still use a direct array variable for something very local and/or temporary, but in general it'll be refs. These days, I often go all the way to a `$universal` hashref, which makes it almost trivial to save and restore the program state. PS: Such a `$universal` hashref technique is not recommended for anyone who has issues with spelling key names consistently and accurately.	[reply] [d/l] [select]
Re^3: Faster indexing an array by doom (Deacon) on Feb 17, 2015 at 22:23 UTC
(1) I suggest using %hash and @array for local use, and if you need to pass it around, take a ref to it... I'll often have a routine that creates a %hash and then does a "return \%hash;". The reasoning is that the sigils are something like built-in hungarian notation, and you might as well take advantage of them when you can. (By the way: Conway recommends using a "_ref" suffix on references, but my take is that's either too heavy-handed or doesn't go far enough. If you're going to say "_ref" you might as well specify which it is: "_aref" or "_href". If all you care about is whether it's a reference or a scalar, then use plural or singular names: "$items" or "$item". (2) But the real reason I'm writing is this remark: "These days, I often go all the way to a $universal hashref, which makes it almost trivial to save and restore the program state." (a) 'tis true that hash field names get autovivified and hence defeat "use strict", but if you have a known set of allowed names, you can strictify them yourself, using the "lock_keys" routine from Hash::Util. `my %uni = map{ $_ => undef } @allowed_names; Hash::Util::lock_keys( %hash );` [download] (b) but you're reinventing perl objects. You might as well just do objects... then if you're using something moose-like, like Mouse or Moo, the "allowed names" get turned into a list of "has" lines.	[reply] [d/l]

update