Unless you can write code that divides the problem into separate threads in a way that, by design, solves it faster, multi-threading is just slicing the same cake (CPU time) a slightly different way, with added overhead for managing the threads.
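To make "dividing the problem" concrete, here's a rough sketch, not a drop-in fix for your code. The function names (`sum_of_squares`, `split_range`, `threaded_sum_of_squares`) and the workload are made up for illustration. It splits a range into independent chunks, hands each chunk to a thread, and adds up the partial results:

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(lo, hi):
    """Stand-in workload: sum i*i for lo <= i < hi."""
    return sum(i * i for i in range(lo, hi))


def split_range(n, parts):
    """Split [0, n) into up to `parts` contiguous, non-overlapping chunks."""
    step = max(1, -(-n // parts))  # ceiling division, never zero
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def threaded_sum_of_squares(n, workers=4):
    """Run each chunk in its own thread, then combine the partial sums."""
    chunks = split_range(n, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda c: sum_of_squares(*c), chunks)
        return sum(partials)
```

The chunks share no state, so the split itself is sound. Whether it actually runs faster is a separate question: on a single core, or in CPython where the GIL lets only one thread execute Python bytecode at a time, the threads just take turns on the same CPU time and you pay the scheduling overhead on top.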
I haven't checked your code; I assume it's as efficient as it can be.
the hardest line to type correctly is: stty erase ^H