in reply to [OT] Displaying 4D data in a 2D image.
Okay. I guess I should have supplied some data and some explanation of what I've tried to date. (I didn't want to influence responses or cloud the underlying question with specifics of how and with what tools to do the actual drawing.)
Here is an illustrative, though rather small set of sample data. The keys are HSV (all scaled 0 .. 1 rather than H being 0 .. 360 ). The values are the frequencies of pixels with that HSV value within the image being analysed:
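For anyone wanting to reproduce a dataset of this shape, here is a minimal sketch in Python using only the standard library. It assumes the pixels arrive as 8-bit `(r, g, b)` tuples; the function name `hsv_histogram` and the `levels` quantisation step are my own illustrative choices, not necessarily how the data above was produced.

```python
import colorsys
from collections import Counter

def hsv_histogram(pixels, levels=256):
    """Count pixels per quantised HSV value.

    pixels: iterable of (r, g, b) tuples with 8-bit channels (0..255).
    Returns a Counter keyed by (h, s, v), each component scaled 0..1.
    Quantising to `levels` steps is an assumption made here so that
    near-identical colours share one key.
    """
    counts = Counter()
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        key = tuple(round(c * (levels - 1)) / (levels - 1) for c in (h, s, v))
        counts[key] += 1
    return counts

# Made-up sample: three pure-red pixels and one pure-blue pixel.
sample = [(255, 0, 0)] * 3 + [(0, 0, 255)]
hist = hsv_histogram(sample)
```

The keys are HSV triples in 0 .. 1 and the values are pixel counts, which matches the layout described above.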
And here is a 2D/3D representation of that dataset. The H,S,V -> X,Y,Z in a 256x256x256 'cube'. The color of the plotted points is the pixel color.
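The mapping described above can be sketched roughly as follows. This is an illustration only: `hsv_to_cube_point` is a hypothetical helper name, and the clamping of 1.0 to the last cell is my assumption about how the edge case would be handled.

```python
import colorsys

def hsv_to_cube_point(h, s, v, size=256):
    """Map HSV in 0..1 to integer X,Y,Z in a size^3 cube, plus the plot colour.

    H -> X, S -> Y, V -> Z. A component of exactly 1.0 is clamped to the
    last cell (size - 1); that clamping is an assumption of this sketch.
    """
    x, y, z = (min(size - 1, int(c * size)) for c in (h, s, v))
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    colour = tuple(round(c * 255) for c in (r, g, b))  # 8-bit RGB for drawing
    return (x, y, z), colour

point, colour = hsv_to_cube_point(0.0, 1.0, 1.0)  # pure red
```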
As you can see, the HSV space discriminates the points into two very clear clusters when viewed this way. However, if viewed from different angles, it is possible to discern five or even eight clusters, with a few outliers.
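Viewing the cube "from different angles" amounts to rotating each point about the cube centre and then discarding depth. A minimal sketch, assuming a simple orthographic projection and a yaw/pitch parameterisation (both are my choices for illustration, not what produced the plot above):

```python
import math

def project(point, yaw_deg, pitch_deg, centre=127.5):
    """Rotate a cube point about the cube centre, then drop depth.

    Returns (screen_x, screen_y, depth). Orthographic projection is an
    assumption of this sketch; no perspective is applied.
    """
    x, y, z = (c - centre for c in point)
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    # Rotate about the vertical (Y) axis.
    x, z = (x * math.cos(yaw) + z * math.sin(yaw),
            -x * math.sin(yaw) + z * math.cos(yaw))
    # Rotate about the horizontal (X) axis.
    y, z = (y * math.cos(pitch) - z * math.sin(pitch),
            y * math.sin(pitch) + z * math.cos(pitch))
    return x + centre, y + centre, z

front = project((255, 127.5, 127.5), 0, 0)    # unrotated view
side = project((255, 127.5, 127.5), 90, 0)    # quarter turn about Y
```

Sorting the projected points by the returned depth before drawing lets nearer points overlap farther ones.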
If you look at the frequency data, it is possible to also find two very obvious peaks; and 4 or 5 or 7 more, depending where you apply the cutoff. The challenge is to try and a) find that cut-off; b) partition the dataset around those peaks.
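One possible way to attack both halves of that challenge, offered as a sketch rather than a solution: take the cut-off at the largest jump between neighbouring log-frequencies, then assign every key to its nearest peak. The "largest gap" heuristic, the function names and the sample data are all assumptions made up for illustration.

```python
import math

def find_peaks(hist):
    """Return keys whose frequency lies above the largest log-frequency gap."""
    ranked = sorted(hist.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return [k for k, _ in ranked]
    logs = [math.log(f) for _, f in ranked]
    gaps = [logs[i] - logs[i + 1] for i in range(len(logs) - 1)]
    cut = gaps.index(max(gaps))  # index of the last entry counted as a peak
    return [k for k, _ in ranked[:cut + 1]]

def hsv_distance(a, b):
    """Euclidean distance in HSV, treating hue as circular."""
    dh = abs(a[0] - b[0])
    dh = min(dh, 1.0 - dh)
    return math.sqrt(dh ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2)

def partition(hist, peaks):
    """Group every key under its nearest peak."""
    groups = {p: [] for p in peaks}
    for key in hist:
        nearest = min(peaks, key=lambda p: hsv_distance(key, p))
        groups[nearest].append(key)
    return groups

# Made-up data: two strong peaks (reddish, bluish) plus a few stragglers.
hist = {
    (0.00, 1.0, 1.0): 1000,
    (0.02, 0.9, 1.0): 10,
    (0.66, 1.0, 1.0): 800,
    (0.64, 0.9, 0.9): 5,
    (0.98, 1.0, 1.0): 3,
}
peaks = find_peaks(hist)
groups = partition(hist, peaks)
```

A single largest gap only ever yields one cut-off; finding the "4 or 5 or 7 more" peaks would need the gaps examined at several levels, which this sketch does not attempt.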
I've made no attempt in this image to plot the frequency. The problem with trying to represent the frequency is the range of the values -- from 1 up to 8.7 million. It could be much bigger for larger images.
Even if I used logarithms of the actual frequencies, the human brain is not well adapted to intuiting peaks and troughs of this magnitude from single points of color, let alone if they are logarithmic. Color ramps work well for surfaces where you can visualise darker colors as 'low' and brighter colors as 'high', because of the flow (gradation) between them.
Not so good for this though.
If I draw the 'point/circle/volume' directly proportional to the frequency (eg. 8743064 ), I'd need a screen the size of an aircraft carrier.
Even using a volume -- a sphere will have a radius of 127; a cube will be sized 200x200x200 -- that single point would fill the entire plot posted above.
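The arithmetic behind those sizes can be checked with a few lines of Python. The helper names are hypothetical; the formulas are just the usual area and volume relations solved for the linear dimension.

```python
import math

def circle_radius(freq):
    """Radius of a circle whose area equals freq."""
    return math.sqrt(freq / math.pi)

def sphere_radius(freq):
    """Radius of a sphere whose volume equals freq."""
    return (3.0 * freq / (4.0 * math.pi)) ** (1.0 / 3.0)

def cube_side(freq):
    """Side length of a cube whose volume equals freq."""
    return freq ** (1.0 / 3.0)

peak = 8743064
sizes = {
    "circle radius": circle_radius(peak),
    "sphere radius": sphere_radius(peak),
    "cube side": cube_side(peak),
}
for name, value in sizes.items():
    print(f"{name}: {value:.1f}")
```

For the 8743064 peak this gives a circle radius in the region of 1668, a sphere radius just under 128 and a cube side of about 206, so even the volume encodings swamp a 256x256x256 plot.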
No matter how you plot it -- even using (say) cubes of the logs of the frequencies; and expanding the XYZ scales by (say) 10 to create more space between the points -- no 3D to 2D plot allows you to see what is going on.
I am the only one who will see these plots, so utility is far more important than prettiness.
I have a bunch of images, some quite large, and I'm looking to quickly visualise those datasets in order to explore possibilities for categorising them. I.e. the plots are just a way to look for how to tackle a problem, not a solution in themselves.
They are throw away steps on the way to appreciating a bigger problem, and as such should not require large amounts of effort to develop.
In the past I've been lucky enough to have access to high-end proprietary software that would allow you to rotate 3D plots on screen with the mouse or keyboard in real-time, but I don't know of any free tool that allows this. Nor is there any graphical toolkit (for Perl or anything else) that I'm aware of that would allow me to 'knock up' such an application quickly and easily.
The point here is that this visualisation is not the underlying problem I am exploring. It is just a step along the way to getting to grips with a dataset. Whilst the particular sample image appears -- yet to be confirmed -- to have its pixels clustered both by frequency and HSV into a small number of distinct groups, it may be that when I apply this process to other relevant images, no such clustering occurs. So the goal is not to develop an all-singing, all-dancing 4D data visualisation tool. It is to find a way to visualise a few example sets of data, in a few different ways, to see if there is anything there worth exploring.
Update: Here is another view. With the points plotted as circles using log2( freq ), and the image rotated so that V->X, S->Y, H->Z. It is interesting because it highlights the presence of more than two groups when viewed from this angle.
It is also disappointing because, whilst it is easy enough to pick out the bigger splodges, could anyone pick out the 8.7 million splodge versus the 1.5 million? Or even the 600,000?
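The numbers show why the splodges are hard to tell apart. A small sketch of the view described in the update, with the axis swap and the log2 radius; the helper names are made up, and 1500000 and 600000 are simply the rounded figures quoted above, not exact counts from the dataset.

```python
import math

def log2_radius(freq):
    """Circle radius used for a point: log2 of its frequency (freq >= 1)."""
    return math.log2(freq)

def axis_swap(h, s, v, size=256):
    """Rotated view: V -> X, S -> Y, H -> Z, scaled to the cube."""
    return v * (size - 1), s * (size - 1), h * (size - 1)

# The exact peak plus the two rounded figures mentioned in the text.
radii = [log2_radius(f) for f in (8743064, 1500000, 600000)]
for r in radii:
    print(f"{r:.2f}")
```

The three radii come out at roughly 23.1, 20.5 and 19.2, so a 14-fold difference in frequency shows up as a circle only about 20% wider.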