Frankly I didn't have the motivation to reply earlier, since I certainly cannot cope with you. (My degree in mathematics isn't enough, one in English literature seems necessary)

You said "a lack of formalism" from your side was evident, unfortunately O(1) is a very well defined formula, so why do you even introduce it?

Did you notice that in 20 lines of the OP's sample 5 entries were already skipped? So a 25% "gap-factor" are the Op's lower bound.

With 1.5 GB of data this "best case" results in about 35 Million entries with roughly 9 Million skipped. And then your proposing an algorithm which goes linear after the first step to be O(1)...?

Your now arguing with "realistic" examples of equally distributed gaps ...! For "real world conditions" better try something like Gaussian distribution.

Then you claim my worst case scenario isn't realistic ... (actually I have to admit it's not even near the worst case). But look, a worst case is just the extremal point of a much bigger set of very bad cases.

Just try to add ONE single line starting with 0 to the OP's sample to get exactly one of these very bad cases. And now tell me about the improbability of such data at the beginning of a file... a case were your alleged "fixed number of operations algorithm" - thats what O(1) means - will need an eternity to succeed in a 1.5 GB File.

That's why worst case scenarios are investigated, with them algos are stable to any kind of unforeseen conditions. And worst cases are much easier to calculate than "average cases". (And for sure more reliable than speed sensors of airbuses over the middle of the Atlantic, where surely engineers covered "all probable cases".)

Just read wikipedia and the references listed, Binary search is as well stable in average and worst cases. And Indexing methods can be much faster with just some space complexity added.

PLEASE NOTE: I'm NOT saying that (a recursively till the end executed) Interpolation Search is a bad approach, I myself proposed "calculate the right lines" hours before you even posted! It's an obvious and good idea, in well known cases.

But, sorry with all respect, stubbornly defending that it's O(1) is a mixture of hubris and nonsense, which makes me speechless *.

Whats next? Why don't you go to a conference of physicists and claim you discovered cold fusion of hydrogen! (It's simple, you just added oxygen and heated it up ...) And when people politely object, you'll insist that their definition of "cold fusion" is too formalistic and just not practical in real live!

Prove you're right and publish it and I'll happily apologize to you.

Cheers Rolf

(*) ... from now on, I promise!


In reply to Re^6: Rapid text searches ( O(log n) time) by LanX
in thread Rapid text searches by joomanji

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.