Linux and most *nixes provide an API for tying a task (a process or thread) to a specific processor, but on most multi-CPU systems the scheduler does a pretty good job without much tweaking. The statement 'load the data into memory...greatly speedup(sic)' is cause for concern, however. Unless the data is read-only, it can be very challenging to improve performance through parallel processing, because tasks that write to shared data have to synchronize their access, and that coordination can eat up most of the gain.
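
For reference, here is a minimal sketch of what that affinity API looks like on Linux, using Python's `os.sched_setaffinity` and `os.sched_getaffinity` (thin wrappers over the Linux system calls of the same names). The helper name `pin_to_cpu` and its `cpu_index` parameter are made up for illustration, and the functions are not available on every platform, so the sketch simply returns `None` where they are missing. Treat it as a starting point for experiments rather than a recommendation to pin anything.

```python
import os


def pin_to_cpu(cpu_index=None):
    """Restrict the caller to a single CPU and return the resulting affinity set.

    Hypothetical helper for illustration. Returns None on platforms that
    do not expose the Linux affinity calls (e.g. macOS, Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None

    # PID 0 means "the caller". This is the set of CPUs we may run on now.
    allowed = os.sched_getaffinity(0)

    # Default to the lowest-numbered CPU we are already allowed to use.
    if cpu_index is None:
        cpu_index = min(allowed)
    if cpu_index not in allowed:
        raise ValueError(f"CPU {cpu_index} is not in the allowed set {sorted(allowed)}")

    # Ask the kernel to schedule the caller only on the chosen CPU.
    os.sched_setaffinity(0, {cpu_index})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("affinity after pinning:", pin_to_cpu())
```

In C the equivalent is `sched_setaffinity(2)` with a `cpu_set_t` mask. Either way, it is worth measuring before and after: pinning takes a decision away from the scheduler, and it does nothing to help with contention on shared, writable data.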