note
Mabooka-Mabooka
Gentlemen,
<p>1st of all - huge thank you to everybody. It’s as always a pleasure to be here and have your questions answered. I *love* this community of wise, intelligent and professional hackers:-), (-- although I am not too much into Perl:-)- (sorry))..
</p>
<p>A few general remarks:
<br>
1. I am sorry if my initial post caused a bit of frustration for some of you. It wasn’t intended to be. My point was:
<br> I expect a higher quality of answers in FAQ than in general posts. And the more F is a Q, the higher quality the A (I assume) should be. That’s all.
<br> <i> It’s just my opinion of course, but I think many people treat “FAQ” as the only source of truth...</i>
</p>
<p>
2. Thanks a lot for the two _binary search_ answers. Both seem to work, I’ll do some corner-case and performance testing and compare them to hash implementation as well, -- and probably post the results here (if nobody minds).
<br> 2-a). I’ve made two changes in <b> find_int_in_array </b>:
<br>A)
<code>
#my ( $arref, $targ ) = @_; # args are array ref, int value
my ( $targ, $arref ) = @_; # args are: (int, array ref)
</code>
this is just a “style” thing, easier to remember (int in arr, not arr in int), plus - easier to test together with other impl-s by the same driver;
<br>B)
<br>Changed
<code>
my $nextidx = $asize / 2;
my $nextinc = $nextidx / 2;
</code>
<br>to:
<code>
my $nextidx = int($asize / 2);
my $nextinc = int($nextidx / 2);
</code>
(Otherwise, for an array of 5 elements, it’d return an index = 2.5 ;-...)).
</p>
<p>
3. Re: using <code> v.s <pre>:
<br>I do know the rules and tried not to, but,,, on my screen, it’s either <code> is too small (cannot read), or with the font size increase, everything else’s too big and bold..... Maybe smb. can review this policy? (Minor thing of course).
</p>
<p>
4. ($#array + 1) vs. scalar(@array) :
<br>You’re going to laugh, but I did a performance test..:). The results is: $#array is ~10-15% faster than the other one. Intuitively, that’s what one would expect (my wild guess is that $#array is probably an internal counter that is always kept up-to-date, and the scalar(@array) gets calculated).
<br> ....This is a pure academical question of course:-)... It’s hard to imagine an application that would suffer from using the 2nd approach.
</p>
<p>
5. Using a hash vs. Binary search.
<br>
Again, hash works just fine, except for three implications:
<ul>
<li> memory; </li>
<li> slows down for huge amounts of data (vs. bisearch which is always linear);</li>
<li> <b><i> complex:-)</i></b></li>
</ul>
<br>Enough said about ## 1 & 2; let me explain the 3d one:
<br>... Imagine an application that updates an array rarely and looks up often. Say you have 10,000,000 customers / books / whatever that get added / published only once a minute, but information is requested 1000 per second.
<br>So, re-building the hash for lookup 1000 times a second is clearly not an option. Right?
<br> A solution to that might be (I’ll think in some pseudo-language now):
<code>
class Array_with_Hash:
@the_array = ()
% the_hash = ();
init(@a)
{
put (a =>> the_array);
recreate_hash (the_array, the_hash);
}
pop(elem):
{
array.pop(el)
update_hash(el);
}
...
</code>
</p>
Well, if you know for sure that push and pop is *all* that you do, that might be possible. However, creating a generic class like this might be really tricky.
<ul>
<li> array must be locked (no access to it directly, -- only by this class’s methods;</li>
<li> hash values probably have to be arrays_of_indexes... </li>
<li> implementing sort(), slice-and-dice etc. might take some time; </li>
<li> it’s quite easy to “forget” about something...</li>
<li>...</li>
</ul>
<br>This only means that the hash approach only works if <ul>
<li> cost(recreating the hash every time) == nothing</li>
<li>OR </li>
<li> cost(programming time) == nothing..:)</li>.
</ul>
In other words, -- thanks again for the binary search in Perl! (And it should be in FAQ, too...:-)...
488776
488776