in reply to Processing data with lot of math...
I think that it is possible to do this with one loop, and almost no math!
However, I could be missing something, so read, digest, cogitate and draw your own conclusions.
For example, if you have an atom located at point ( 10, 10, 10 ), and the cut-off distance is 1 unit, then any atom that will be paired with this atom will have to be somewhere within a box defined by the points
p( 09, 09, 09 ) p( 09, 09, 11 ) p( 09, 11, 09 ) p( 09, 11, 11 ) p( 11, 09, 09 ) p( 11, 09, 11 ) p( 11, 11, 09 ) p( 11, 11, 11 )
So, instead of storing the x:y:z in separate arrays, or some other complex data structure, create a single array, indexed by the concatenation of x:y:z (with leading zeros). The indices are quoted here because a bare literal with a leading zero, such as 090909, would be parsed by Perl as an (invalid) octal number.
$atoms[ '090909' ] = 'name1';
$atoms[ '090911' ] = 'name2';
$atoms[ '091109' ] = 'name3';
$atoms[ '091111' ] = 'name4';
$atoms[ '110909' ] = 'name5';
$atoms[ '110911' ] = 'name6';
$atoms[ '111109' ] = 'name7';
$atoms[ '111111' ] = 'name8';
Once you have created this (sparse) array, it becomes very easy to pick out the limited set of possible pairings for any given point. It requires a final math check to exclude those points lying in the corners of the box, but the number of calculations should be far fewer than with the brute force method.
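To make the idea concrete, here is a minimal sketch of the box lookup followed by the distance check. It assumes integer coordinates in the range 0..99, an integer cut-off, and at most one atom per cell; the sample atoms and the names `key_for` and `neighbours` are made up for illustration and are not part of the original problem.

```perl
use strict;
use warnings;

# Hypothetical sample data: name => [ x, y, z ], integers in 0..99.
my %coords = (
    a => [ 10, 10, 10 ],
    b => [ 10, 11, 10 ],    # 1 unit from a
    c => [ 11, 11, 11 ],    # inside a's box, but sqrt(3) away (a corner)
    d => [ 20, 20, 20 ],    # nowhere near a
);

# Build the index by concatenating zero-padded x, y and z.
sub key_for { return sprintf '%02d%02d%02d', @_ }

# Sparse array; the key string is used as a numeric index.
my @atoms;
$atoms[ key_for( @{ $coords{$_} } ) ] = $_ for keys %coords;

# Return the names of all atoms within $cutoff of the named atom.
sub neighbours {
    my ( $name, $cutoff ) = @_;
    my ( $x, $y, $z ) = @{ $coords{$name} };
    my @found;
    for my $i ( $x - $cutoff .. $x + $cutoff ) {
        for my $j ( $y - $cutoff .. $y + $cutoff ) {
            for my $k ( $z - $cutoff .. $z + $cutoff ) {
                # Skip cells outside the assumed 0..99 range.
                next if grep { $_ < 0 || $_ > 99 } $i, $j, $k;
                my $other = $atoms[ key_for( $i, $j, $k ) ];
                next if !defined $other || $other eq $name;
                # Final math check: discard the corners of the box.
                my $d2 = ( $i - $x )**2 + ( $j - $y )**2 + ( $k - $z )**2;
                push @found, $other if $d2 <= $cutoff**2;
            }
        }
    }
    return sort @found;
}

print join( ' ', neighbours( 'a', 1 ) ), "\n";
```

Only the 27 cells of the box are ever inspected for each atom, however many atoms there are in total.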
There are some problems with this.
Your coordinates are probably real numbers rather than small integers. If so, I would use a scaling factor on each of the three coordinates to bring them into some manageable range of integers, and store the real coordinates along with the name. I would do this as a single string rather than as an array ref, as it costs very little extra memory to store a longer scalar in an array element, but creating thousands of small arrays costs a lot.
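A rough sketch of the scaling and of storing the real coordinates alongside the name in one string. The scale factor of 10, the atom name `C1`, and the helper names `cell_key`, `pack_atom` and `unpack_atom` are assumptions for illustration; it also assumes every scaled coordinate lands in 0..99 so that two digits per axis are enough.

```perl
use strict;
use warnings;

# Hypothetical scale factor: one cell per 0.1 unit.
my $SCALE = 10;

# Scale each real coordinate to an integer and concatenate, zero-padded.
sub cell_key {
    my @real = @_;
    return join '', map { sprintf '%02d', int( $_ * $SCALE ) } @real;
}

# Keep the name and the real coordinates together in one scalar,
# rather than in a small anonymous array per atom.
sub pack_atom {
    my ( $name, $x, $y, $z ) = @_;
    return join ' ', $name, $x, $y, $z;
}

# Recover ( name, x, y, z ) from the packed string.
sub unpack_atom { return split ' ', $_[0] }

my @atoms;
$atoms[ cell_key( 1.23, 4.56, 7.89 ) ] = pack_atom( 'C1', 1.23, 4.56, 7.89 );

my ( $name, @xyz ) = unpack_atom( $atoms[124578] );
print "$name at @xyz\n";
```

The scale factor would normally be chosen so that one cell is about the size of the cut-off distance, which keeps the box to search small.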
The array is also likely to be very sparse, in which case it makes sense to use a hash to implement the sparse array. The downsides of this are that the hash takes substantially more memory, and that you then need to sort the hash keys prior to processing them.
Depending upon how many thousands of atoms we are talking about, this could well lead to you requiring more memory than is available, or even possible (assuming a 32-bit processor).
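As a small illustration of the hash variant (the keys and names below are made-up sample data): because every key is zero-padded to the same width, a plain string sort puts them in the same order as a numeric sort would.

```perl
use strict;
use warnings;

# Hash used as a sparse array; keys are the zero-padded xxyyzz strings.
my %atoms = (
    '111111' => 'atomC',
    '090909' => 'atomA',
    '091109' => 'atomB',
);

# Hash keys come back in no particular order, so sort before processing.
my @ordered = sort keys %atoms;
print "$_ => $atoms{$_}\n" for @ordered;

# A single cell can still be looked up directly, without sorting.
my $hit = exists $atoms{'091109'} ? $atoms{'091109'} : undef;
```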
At this point the solution becomes to process the files and create a single output file where each record is of the form
xxyyzz name x.xx y.yy z.zz
A single pass to create the file. Then sort the file (using your system's sort utility), and then a single pass to read the sorted output, do the final comparisons and write out the pairs in whatever format you need them.
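A hedged sketch of producing and reading back records in that form. The scale factor, the sample atoms and the helper names `format_record` and `parse_record` are assumptions; the in-memory `sort` here only stands in for the external sort utility, which is what you would use on a file too large to hold in memory.

```perl
use strict;
use warnings;

my $SCALE = 10;    # hypothetical scale factor, as an assumption

# Build one "xxyyzz name x.xx y.yy z.zz" record.
sub format_record {
    my ( $name, $x, $y, $z ) = @_;
    my $key = join '', map { sprintf '%02d', int( $_ * $SCALE ) } $x, $y, $z;
    return sprintf '%s %s %.2f %.2f %.2f', $key, $name, $x, $y, $z;
}

# Split a record back into ( key, name, x, y, z ).
sub parse_record { return split ' ', $_[0] }

# Hypothetical input: [ name, x, y, z ].
my @input = (
    [ 'O2', 5.50, 0.25, 3.10 ],
    [ 'C1', 1.23, 4.56, 7.89 ],
);

# First pass: one record per atom.
my @records = map { format_record(@$_) } @input;

# Stand-in for the external sort; fixed-width keys sort as plain text.
my @sorted = sort @records;

# Second pass: read the sorted records back.
for my $line (@sorted) {
    my ( $key, $name, @xyz ) = parse_record($line);
    print "$key: $name (@xyz)\n";
}
```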