Re^2: Gay Dating; or, Thrifty Set Intersections with an Accumulator

Big O notation is a little woolly all by itself. I think you know this, so I won't drive the point into the ground. For some value of "faster" (where faster is not better), an O(N²)-bounded algorithm grows faster than an O(N)-bounded one. That doesn't ensure the latter is always preferred; it only indicates.

I stated a range to exclude the possibility of better (for some value of "better") algorithms than O(N) (or worse than O(N²)). This is not a sophisticated or advanced node. This is a pretty basic introduction to thinking about how to attack a problem set.

As stated clearly in my FINAL WARNING, the script shown is not any sort of solution to the bioinformatics exercise, here only hinted at. It's a demonstration of technique, only loosely related to the preceding essay. There are only ten simple records; almost anything would work. If we wait, somebody may golf it down to a one-liner.

The difference between my code and your code is that, as the data set gets larger and the problem set more complex, my editing may go a bit more smoothly. I don't doubt that you'll wind up with a really efficient script; you're good. But as the stress piles on, I'll accommodate it; my pieces are flexible and modular (in the wider sense). To put it another way, I would fear to maintain your code, while you would only hate to maintain mine.

Note: Contents may change.

Comment on Re^2: Gay Dating; or, Thrifty Set Intersections with an Accumulator Download Code

Replies are listed 'Best First'.
Re^3: Gay Dating; or, Thrifty Set Intersections with an Accumulator by JavaFan (Canon) on Aug 11, 2010 at 18:41 UTC
For some value of "faster" (where faster is not better), an O(N2)-bounded algorithm grows faster than an O(N)-bounded one. This shows you don't know much about big-Oh. Any algorithm whose running time is in "`O(N)`" (for N towards infinity) is also in "`O(N²)`". And big-Oh notation is not "woolly". In fact, if the running time of an algorithm is expressed as `T(N)`, with N the size of the input, and it's said that `T(N)` is in `O(f(N))`, it means that: ∃ M, c > 0: ∀ m > M: T(m) <= c f(m) The difference between my code and your code is that, as the data set gets larger and the problem set more complex, my editing may go a bit more smoothly. And you base this statement on what? (And what has editing to do with the size of the data set?) To put it another way, I would fear to maintain your code, while you would only hate to maintain mine. I don't know what to think of a programmer that fears to maintain code that consists of two small, self contained blocks, and a small subroutine.	[reply]

Replies are listed 'Best First'.

Re^3: Gay Dating; or, Thrifty Set Intersections with an Accumulator
by JavaFan (Canon) on Aug 11, 2010 at 18:41 UTC

For some value of "faster" (where faster is not better), an O(N2)-bounded algorithm grows faster than an O(N)-bounded one.

Any algorithm whose running time is in "O(N)" (for N towards infinity) is also in "O(N²)".

And big-Oh notation is not "woolly". In fact, if the running time of an algorithm is expressed as T(N), with N the size of the input, and it's said that T(N) is in O(f(N)), it means that:

    ∃ M, c > 0: ∀ m > M: T(m) <= c f(m)

The difference between my code and your code is that, as the data set gets larger and the problem set more complex, my editing may go a bit more smoothly.

To put it another way, I would fear to maintain your code, while you would only hate to maintain mine.

[reply]