Actually I never questioned your code just your headlines and assertions...
And what is it about my responses:
The headline claim for O(1) time is based on comparison with Perl's hashes which also do a linear search down a bucket chain to resolve hash clashes, but are generally rated as O(1).
It was a headline, not a formal thesis. The absence of any further proof or discussion, combined with the "crude demo code", should be a strong indication of the lack of formalism.
that does not satisfy you that the mention of O(1) was never more than a shorthand for saying "quickly and efficiently"--just "a headline, not a formal thesis."? It's purpose was soley to indicate that there are alternatives to using a database or a linear search from the top of the file.
Your own links show that--when coded properly, not just "crude demo code";--a dictionary sort can even outstrip a binary search O(log log N) versus O(log N).
For 38e6 (estimate) records, that's
c:\test>perl -E"say log 38e6"
17.4530967176907
c:\test>perl -E"say log log 38e6"
2.8595170952357
Sufficient potential improvement to warrent bringing that potential to the attention of the OP. And sufficiently closer to O(1) than O(N) to warrent the informal use of the former in preference to the latter in characterising the algorithm.
Is it possible to improve the crude implementation I posted to render a better than O(log N) efficiency? Unless the OP responds and gives us access to one or more realistic copies of his data--we'll never know for sure. Which is just repeating everything I said in the last three paras of 769241.
And that, as the saying goes, is all she wrote. In the absence of any further feedback from the OP about the true nature of his data, further speculation serves no other purpose than a pissing contest.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
|