I will be quick cuz it is 4am here. I was referring to
running time for the big-O complexity.
Space complexity is a different measure but even if we do count it, the complexity of the algorithms is still
O(m+n).
This is because the
key idea in the algorithms is that we traverse the 1st array, create a hash and then test the hash against the 2nd array, which for arrays with
m and
n elements is p*m+q*n, p,q constants, or better O(m+n). If we were
not to use hashes the running time would be O(m*n) because we would have 2 nested loops. Also,the space allocations are not ~2^n but ~logN which is ignored in the O(m+n) notation.
I will
strongly disagree with your statement about big-O notation being done only in academic papers. The notation does not tell you anything about the
actual running time of the algorithm, but it is a hint on how the complexity
scales with the input. Real benchmarks are most important but they do not replace mathematical analysis but supplement it. In simple terms O(m*n) indicates 2 nested loops, while O(m+n) indicates 2 independent loops. That's it. There is no reference to actual time.
Complexity theory is a real treasure when we can apply it properly. Actually, your method is a variation of a
Bloom Filter which is known for being very fast. To overcome the O(m+n) a special structure like a
rooted tree(heap) is needed, because there is no better solution.
This paper (PDF) is one of he many proofs that the union problem is linear on the input.
Finally the new version
buk2() may be faster but
drops a key advantage of the previous, which is the ability to handle duplicate array elements. Still your algorithm is the
killer one and I haven't think of anything better.
That's all for now. I am going for some sleep and come back tomorrow, hopefully with some code on the problem. Till then I will use big-O notation to count the sheep!
Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
Read Where should I post X? if you're not absolutely sure you're posting in the right place.
Please read these before you post! —
Posts may use any of the Perl Monks Approved HTML tags:
- a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
| |
For: |
|
Use: |
| & | | & |
| < | | < |
| > | | > |
| [ | | [ |
| ] | | ] |
Link using PerlMonks shortcuts! What shortcuts can I use for linking?
See Writeup Formatting Tips and other pages linked from there for more info.