I have a small Perl script that takes a crop spec with the XY position, width and height of a box, grabs the 48 bit RGB as a hash key and increments the data. It takes just a second on a ~100x100 pixel rectangle.
I use it to measure the quality of color spaces and various photo processes.
I wrote something much more clunky in C for use on entire, 216 MB pictures. But it takes 2.17 HOURS to run on a D800E pic with 27 Million unique colors. The lookup table made up of 64bit long longs gets to be 27E6 long toward the end and takes ~25 bsearch comparisons per lookup.
What I need is a good algorithm to take a 48 bit integer with quantomly random values and spit out an integer in the range 0-36E6. Perl has some pretty slick Hashing capabilities. I wonder whether any of them could be leveraged here.
GORY DETAILS: ----------
I have a C program which uses an array of 36 Million uint64s to hold 48 bits of RGB color data (16bits/channel) and 16 bits for a counter for the distinct colors found when scanning a 216 MB .RAW file.
For extremely colorful pictures, the length of this lookup table gets as high as 27 Million items. Even with a bsearch, it takes over 2 hours to count the number of pixels sharing each distinct color.
How does one HASH 48 bits of ~random data RGB data and return an integer in the range 36E6 to as high as many GB with 32GB of system RAM?
CURRENT PROCEDURE: -------------------------------
BSearch the known colors for the new color and increment the 16 bit counter if found. Else, append the new, unique colors to the end of the array and periodically QSort the new ones. Then I do a shuffle merge on the millions of old, sorted colors and the hundreds of new, colors (after qsorting the new) appending as many as possible from each array onto a merge array and go back and forth until old and new are all merged.
QSorting an array with millions of sorted and a few hundred unsorted seems like a waste. If bsearch fails on the sorted bottom, I do a linear search on the unsorted top and append the color if not found.
The BSearch comparison function masks off the highest 16 bits (0x0000FFFFFFFFFFFF) of *A and *B and returns 1 if *A & MASK > *B & MASK, -1 if <, else 0. Can't think of a faster way to compare 2 long longs from pointers.
What I need is a Perl Hash (just do it) and C speed; Cerl Language? tags hash raw rgb 48bit
In reply to Perl Hashes in C? by BrianP
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |