These are TrueColor. I am dealing with 281 TRILLION possible colors (2^48): full 16 bits/channel, 48 bits/pixel.
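For scale, 281 trillion is just 2^48, and a 48-bit pixel fits in one 64-bit integer with room to spare. A minimal sketch, assuming each pixel arrives as three 16-bit channel values; `pack_rgb48` and `rgb48_color_space` are names made up here for illustration, not functions from the actual program:

```c
#include <stdint.h>

/* Hypothetical helper: pack one 16-bit-per-channel RGB pixel into a
 * single 48-bit key stored in the low bits of a uint64_t.
 * Layout (an assumption): red in bits 47..32, green 31..16, blue 15..0. */
uint64_t pack_rgb48(uint16_t r, uint16_t g, uint16_t b)
{
    return ((uint64_t)r << 32) | ((uint64_t)g << 16) | (uint64_t)b;
}

/* Number of distinct 48-bit colors: 2^48 = 281,474,976,710,656. */
uint64_t rgb48_color_space(void)
{
    return (uint64_t)1 << 48;
}
```

With the pixel held as one integer key, comparing, sorting and searching colors becomes plain integer work.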
"Found 81814 colors in 131072000 pixels" -> 1602 Pixels per color.
The 216MB Photoshop RAW/16 file had 27 MILLION unique colors out of 36M pixels: "Pixels=36152321, unique Colors=27546248=76.19%"
76% of the pixels have unique colors! This makes your hashing algorithm rehash everything when it lands on a dup.
I am monkeying with the MAX_UNSORTED parameter which determines when a sort has to be done after so many new, random colors have been piled on top of the lookup table.
I had it set at a way, way too low 200. I wrote a Perl script to run the C program with varying MAX_UNSORTED values and am seeing vastly better performance; 3805 is the best so far. The linear searches on top of the pile are pretty cheap compared to QSorting and merging.
With 1-in-3 sampling (12M of 36M pixels), I have it down to < 46 seconds with 88.55% unique colors.
The larger the number of unique colors, the more it pays to leave a pile of unsorted colors on top.
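The scheme described above (a sorted lookup table with a pile of unsorted new colors on top) can be sketched roughly as follows. This is a sketch under assumptions, not the actual program: colors are assumed to be 48-bit keys held in `uint64_t`, and `ColorTable`, `table_init`, `table_contains`, `table_add` and `table_free` are hypothetical names. The re-sort step here is the simplest possible one, a `qsort` over the whole array.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t *keys;       /* keys[0..n_sorted) sorted; keys[n_sorted..n) is the pile */
    size_t n_sorted;      /* length of the sorted part */
    size_t n;             /* total number of unique colors stored */
    size_t cap;           /* allocated capacity, in keys */
    size_t max_unsorted;  /* pile size that triggers a re-sort */
} ColorTable;

static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

void table_init(ColorTable *t, size_t max_unsorted)
{
    t->keys = NULL;
    t->n_sorted = t->n = t->cap = 0;
    t->max_unsorted = max_unsorted;
}

void table_free(ColorTable *t)
{
    free(t->keys);
    t->keys = NULL;
    t->n_sorted = t->n = t->cap = 0;
}

/* Returns 1 if key is already in the table, 0 otherwise. */
int table_contains(const ColorTable *t, uint64_t key)
{
    /* Binary search over the sorted part. */
    if (t->n_sorted > 0 &&
        bsearch(&key, t->keys, t->n_sorted, sizeof key, cmp_u64) != NULL)
        return 1;
    /* Linear scan over the unsorted pile on top. */
    for (size_t i = t->n_sorted; i < t->n; i++)
        if (t->keys[i] == key)
            return 1;
    return 0;
}

/* Returns 1 if key was new and added, 0 if it was a duplicate,
 * -1 if memory could not be allocated. */
int table_add(ColorTable *t, uint64_t key)
{
    if (table_contains(t, key))
        return 0;
    if (t->n == t->cap) {
        size_t new_cap = t->cap ? t->cap * 2 : 1024;
        uint64_t *p = realloc(t->keys, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        t->keys = p;
        t->cap = new_cap;
    }
    t->keys[t->n++] = key;
    /* Once the pile grows past the threshold, fold it into the sorted part. */
    if (t->n - t->n_sorted > t->max_unsorted) {
        qsort(t->keys, t->n, sizeof *t->keys, cmp_u64);
        t->n_sorted = t->n;
    }
    return 1;
}
```

Calling `table_init(&t, 3805)` would match the best setting found so far. A larger threshold means longer linear scans per lookup but fewer re-sorts, which is the trade-off being tuned.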
The one I did before was a ColorMatch colorspace and it had ~76% unique colors. This one is ProPhoto and is over 85%! Same NEF file, same ACR settings, no Photoshop other than to import from ACR and save as RAW.
It looks like I need to work on the Sort_Merge. QSort on the entire 27-million-tall stack, 99% already sorted, was taking 98% of the program time. The shuffle_merge is 100 times faster on this problem.
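One way to avoid re-sorting the whole stack is to sort only the small pile and then merge it into the already-sorted part in a single pass from the back. This is a sketch of that general idea, not a reconstruction of shuffle_merge, whose internals are not shown in this thread; `sort_pile_and_merge` is a hypothetical name and keys are assumed to be `uint64_t`.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* On entry: keys[0..n_sorted) is sorted, keys[n_sorted..n) is the unsorted pile.
 * On success: keys[0..n) is sorted and 0 is returned.
 * Returns -1 (array untouched) if the temporary buffer cannot be allocated. */
int sort_pile_and_merge(uint64_t *keys, size_t n_sorted, size_t n)
{
    size_t n_pile = n - n_sorted;
    if (n_pile == 0)
        return 0;

    /* Copy the pile out and sort just that small part. */
    uint64_t *pile = malloc(n_pile * sizeof *pile);
    if (pile == NULL)
        return -1;
    memcpy(pile, keys + n_sorted, n_pile * sizeof *pile);
    qsort(pile, n_pile, sizeof *pile, cmp_u64);

    /* Merge from the back. The write position is always i + j, so it never
     * overtakes an element of the sorted part that has not been read yet. */
    size_t i = n_sorted;  /* one past the last unmerged sorted element */
    size_t j = n_pile;    /* one past the last unmerged pile element */
    size_t out = n;       /* one past the next write position */
    while (j > 0) {
        if (i > 0 && keys[i - 1] > pile[j - 1])
            keys[--out] = keys[--i];
        else
            keys[--out] = pile[--j];
    }
    /* Whatever remains of the sorted part is already in place. */
    free(pile);
    return 0;
}
```

The cost is one sort of the pile plus one linear pass over the array, instead of a comparison sort over all 27 million entries every time the pile fills up.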
In reply to Re^2: Perl Hashes in C? by BrianP, in thread Perl Hashes in C? by BrianP