http://qs1969.pair.com?node_id=489285


in reply to “A meeting at the Liquor-Vodka Factory”, or… same ARRAY questions again?!!

Gentlemen,

1st of all - huge thank you to everybody. It’s as always a pleasure to be here and have your questions answered. I *love* this community of wise, intelligent and professional hackers:-), (-- although I am not too much into Perl:-)- (sorry))..

A few general remarks:
1. I am sorry if my initial post caused a bit of frustration for some of you. It wasn’t intended to be. My point was:
I expect a higher quality of answers in FAQ than in general posts. And the more F is a Q, the higher quality the A (I assume) should be. That’s all.
It’s just my opinion of course, but I think many people treat “FAQ” as the only source of truth...

2. Thanks a lot for the two _binary search_ answers. Both seem to work, I’ll do some corner-case and performance testing and compare them to hash implementation as well, -- and probably post the results here (if nobody minds).
2-a). I’ve made two changes in find_int_in_array :
A)

#my ( $arref, $targ ) = @_; # args are array ref, int value my ( $targ, $arref ) = @_; # args are: (int, array ref)
this is just a “style” thing, easier to remember (int in arr, not arr in int), plus - easier to test together with other impl-s by the same driver;
B)
Changed
my $nextidx = $asize / 2; my $nextinc = $nextidx / 2;

to:
my $nextidx = int($asize / 2); my $nextinc = int($nextidx / 2);
(Otherwise, for an array of 5 elements, it’d return an index = 2.5 ;-...)).

3. Re: using <code> v.s <pre>:
I do know the rules and tried not to, but,,, on my screen, it’s either <code> is too small (cannot read), or with the font size increase, everything else’s too big and bold..... Maybe smb. can review this policy? (Minor thing of course).

4. ($#array + 1) vs. scalar(@array) :
You’re going to laugh, but I did a performance test..:). The results is: $#array is ~10-15% faster than the other one. Intuitively, that’s what one would expect (my wild guess is that $#array is probably an internal counter that is always kept up-to-date, and the scalar(@array) gets calculated).
....This is a pure academical question of course:-)... It’s hard to imagine an application that would suffer from using the 2nd approach.

5. Using a hash vs. Binary search.
Again, hash works just fine, except for three implications:


Enough said about ## 1 & 2; let me explain the 3d one:
... Imagine an application that updates an array rarely and looks up often. Say you have 10,000,000 customers / books / whatever that get added / published only once a minute, but information is requested 1000 per second.
So, re-building the hash for lookup 1000 times a second is clearly not an option. Right?
A solution to that might be (I’ll think in some pseudo-language now):
class Array_with_Hash: @the_array = () % the_hash = (); init(@a) { put (a =>> the_array); recreate_hash (the_array, the_hash); } pop(elem): { array.pop(el) update_hash(el); } ...

Well, if you know for sure that push and pop is *all* that you do, that might be possible. However, creating a generic class like this might be really tricky.
This only means that the hash approach only works if In other words, -- thanks again for the binary search in Perl! (And it should be in FAQ, too...:-)...