So, does your code run now? If it runs within whatever time constraint you have (and it sounds like even overnight is "fast enough"), why bother with more optimization? I mean, if it takes you a day or two of work and the program would have already finished by then, it's not a good use of your time. However, a 2D array would probably be better than a 2D hash, since with dense integer indices an array typically uses less memory and skips the key hashing on every lookup.
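To make the array-versus-hash point concrete, here is a rough sketch. I haven't seen your code or data, so this is in Python purely for illustration: a dict of dicts stands in for the 2D hash, a list of lists for the 2D array, and the size `N`, the fill values, and the helper names are all made up. Treat it as a way to measure the difference on your own machine, not as a claim about how much you would gain.

```python
import timeit

N = 20  # hypothetical matrix size, not taken from the real data

# "2D hash": dict of dicts, keyed by integer row and column
hash2d = {r: {c: r * N + c for c in range(N)} for r in range(N)}

# 2D array: list of lists, indexed by position
array2d = [[r * N + c for c in range(N)] for r in range(N)]


def row_sum_hash(h, r):
    """Sum one row by looking up every column key in the nested dict."""
    return sum(h[r][c] for c in range(N))


def row_sum_array(a, r):
    """Sum one row by iterating the inner list directly."""
    return sum(a[r])


if __name__ == "__main__":
    # Both structures hold the same values, so the results must agree.
    assert row_sum_hash(hash2d, 3) == row_sum_array(array2d, 3)

    # Time repeated row sums; absolute numbers depend on the machine.
    t_hash = timeit.timeit(lambda: row_sum_hash(hash2d, 3), number=100_000)
    t_array = timeit.timeit(lambda: row_sum_array(array2d, 3), number=100_000)
    print(f"hash of hashes:  {t_hash:.3f} s")
    print(f"array of arrays: {t_array:.3f} s")
```

If the two timings are close for your real access pattern, that is another hint that switching structures is not worth the effort.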
Update: Is there a way to provide a small, runnable data set? Even a 20x20 matrix or whatever? I.e., some data set that is small enough to post here?