in reply to Re: Perl Hashes in C?
in thread Perl Hashes in C?

Ed,

Your hard core, direct, laser focused code got the answer spot on the first go in the delightfully brief time of < 37 seconds:

Running C:\bin\bb.pl Thu Aug 13 11:06:26 2015
Found 27645898 colors in 36152320 pixels 0.76%
Elapsed time = 36.82 sec

Franklin would have marveled at the design efficiency.

I have been attempting to arrive at the same 6 byte Quanta size for hours using (*!_))#% pack/unpack. UINT16s work perfectly as do UINT64s. How is it possible that nobody has ever thought of a UINT48?

32 bits is too wimpy; 4.3GB is not enough. But 4.3G*4.3G BYTES is clearly OverTheTop! 18446744073709551616 ????

Wouldn't 65536 * 4294967296 Bytes be just about right? Surely 281474976710656 B "is more memory than anybody will ever need"?? 281 TER! Has a ring to it.

A 24 bit processor would be ideally suited for GRAPHICS PROCESSING.

I hate to be a pest (but it does not usually stop me). While I still have a residual amount of hair left, might I ask you if you could point out my obvious comprehensional deficit with UNPACK?

I have all 217MB in memory. 8 byte quanta are too large and 4 byte are too small, so I am stuck with 2 byte type "S", uint16_t. The inane method is all I can get to work, BIT shifting and ORing:

@ushort=unpack("S*", $buf); # < Ushort[108456960] == RIGHT Number!
for($ii=0; $ii < scalar @ushort; $ii+=3) {
    ($rr, $gg, $bb) = @ushort[$ii .. $ii+2]; # Array slice
MORONIC> $bs = $rr | ($gg << 16) | ($bb << 32); # << WORKS ;(
    $rgb2c{$bs}++; # Increment count for this color
}

This works, but as another Monk pointed out, finely slicing and dicing then bit shifting the diminutive chunks and ORing them back together is hardly as satisfying as using 6 byte, native Quanta.

I usually need the individual colors so I need this type of code, just not here. This is a case of wanting to find my error rather than fixing a blocking bug.

How hard can it be to unpack 3 of the units from my array and smash them into the $RGB key I need with NO MONKEY BUSINESS? I tried every permutation of type S I could think of. Type Q worked fine except that it gave 1 1/3 pixel at a time. Is there a way to Unpack 3 UINT16s at a time with UNPACK()??

WORKS!> @q =unpack("Q*", $buf); $sq = scalar(@q) || -1;
FAIL! @uint48=unpack("S3", $buf); $s48 = scalar(@uint48) || -1;
FAIL! @uint48=unpack("S@3", $buf); $s48 = scalar(@uint48) || -1;
FAIL! @uint48=unpack("S[3]", $buf); $s48 = scalar(@uint48) || -1;
FAIL! @uint48=unpack("(SSS)*", $buf); $s48 = scalar(@uint48) || -1;
And other, Quixotic attempts at 48 BITness!
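
For reference, unpack returns one value per template item, so "S3" stops after the first three shorts and "(SSS)*" yields the same flat list as "S*"; no item in the template is 48 bits wide. One possible workaround, sketched here on the assumption that only a unique key per color is needed (not a number), is to pull out 6-byte strings with "(a6)*" and use them directly as hash keys. The buffer below is a tiny made-up stand-in for the real image data, and 'v' (little-endian 16-bit) is used instead of 'S' only to make the sketch independent of the machine's byte order.

```perl
use strict;
use warnings;

# Hypothetical 3-pixel buffer: two identical pixels and one different.
# Each pixel is three 16-bit channels (R, G, B), 6 bytes in total.
my $buf = pack('v3', 1, 2, 3) . pack('v3', 1, 2, 3) . pack('v3', 65535, 0, 7);

# "(a6)*" returns one 6-byte string per pixel; no shifting or ORing needed.
my @pixels = unpack('(a6)*', $buf);

my %rgb2c;
$rgb2c{$_}++ for @pixels;    # the raw 6-byte string is the hash key

printf "%d pixels, %d distinct colors\n", scalar(@pixels), scalar(keys %rgb2c);

# The individual channels can still be recovered from a key when needed.
for my $key (sort keys %rgb2c) {
    my ($rr, $gg, $bb) = unpack('v3', $key);
    print "R=$rr G=$gg B=$bb count=$rgb2c{$key}\n";
}
```

The keys are binary strings rather than integers, so they do not print nicely, but they compare and hash just as well.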

If you can't UNPACK 'em, PACK THEM!

I tried packing 3 shorts, a quad with a pair of NULLs chasing and many other schemes:

#$quad=pack('Q', $rr, $gg, $bb, 0x0000);
#$q2=pack('Q1', $rr, $gg, $bb, 0x0000); # Q2=0x0000000000000000
#$q4=pack('S4', $rr, $gg, $bb, 0x0000); #
#$q5=pack("SSSS", $rr, $gg, $bb, 0x0000); #
#$q3=pack('Q*', $rr, $gg, $bb, 0x0000); # Q3=0x0000000000000000
#$q4=pack("Q", $rr, $gg, $bb, 0x0000); # Q4=0x0000000000000000
#$q5=pack("S*", $rr, $gg, $bb, 0x0000); # Q5=0x0000000000000000
#$q5=pack("Q*", @ushort[$ii .. $ii+2]);

I always got zero or some error or something unprintable.
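
A likely reason, offered as a guess since the printing code is not shown: pack returns a byte string, not a number, so printing it directly gives unprintable bytes, and a single 'Q' only consumes the first argument. A minimal sketch of one way to get a number back is to pack four 16-bit values (the fourth being padding) and then unpack the resulting 8 bytes as one 64-bit integer. The channel values are made up, and the '<' modifier assumes Perl 5.10 or later with 64-bit integer support.

```perl
use strict;
use warnings;

# Made-up channel values for illustration.
my ($rr, $gg, $bb) = (0x1111, 0x2222, 0x3333);

# Four little-endian 16-bit values = 8 bytes; the last one is padding.
my $bytes = pack('v4', $rr, $gg, $bb, 0);

# Reinterpret the same 8 bytes as one little-endian 64-bit integer.
# 'Q' needs a Perl built with 64-bit integer support.
my $packed = unpack('Q<', $bytes);

# The shift-and-OR version, for comparison.
my $shifted = $rr | ($gg << 16) | ($bb << 32);

printf "packed  = 0x%012x\n", $packed;    # prints 0x333322221111
printf "shifted = 0x%012x\n", $shifted;   # prints 0x333322221111
```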

Obviously reading a buffer-full and carving 6 byte slices works. And, reading 3 uint16s and clumsily bit-stitching them together gets the job done. But reading the whole file and unpacking an entire array of finished products in 1 line would be the most elegant and likely the fastest.
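
The "1 line" version can be approximated without a dedicated 48-bit template. The sketch below assumes the buffer length is a multiple of 6, that the channels are stored little-endian (as 'S' reads them on x86), and that Perl has 64-bit integers. It splits the buffer into 6-byte chunks, pads each to 8 bytes, and reads it as a 64-bit value. Whether this beats the loop on speed would need measuring; the buffer here is made up.

```perl
use strict;
use warnings;

# Made-up 2-pixel buffer standing in for the real image data.
my $buf = pack('v6', 10, 20, 30, 65535, 65535, 65535);

# Split into 6-byte chunks, pad each to 8 bytes with two NULs,
# then read the result as a little-endian 64-bit integer.
my @uint48 = map { unpack('Q<', $_ . "\0\0") } unpack('(a6)*', $buf);

print join(', ', @uint48), "\n";
```

Each element equals $rr | ($gg << 16) | ($bb << 32) for the corresponding pixel, so it can be used as a key in the same way as $bs.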

Where is DATATYPE "G"?

@UINT48=unpack("G*", $buf); # NATIVE, 48BIT UNSIGNED GRAPHIC INTS!

It is unlikely that either K or R had digital cameras offering RAW file output so they can be forgiven for overlooking the obvious utility of UINT48.

Perhaps what K&R missed, the Wall Gang can substantiate?

Thank you, Brian

Re^3: Perl Hashes in C?
by flexvault (Monsignor) on Aug 13, 2015 at 19:02 UTC

    BrianP,

      32 bits is too wimpy; 4.3GB is not enough. But 4.3G*4.3G BYTES is clearly OverTheTop! 18446744073709551616 ????
    You have to go back to the math. An 8-bit or a 128-bit machine can get the same answer; it's a matter of knowing how the bits need to be put together :-)

      Wouldn't 65536 * 4294967296 Bytes be just about right?
    For you: Yes, for me, 32bits are fine for 98% of my work. All of my servers have at least 16GB, and many have many times that amount. But I can use 32bit Perl for 98% of the work (smaller footprint), and 64 bit Perl for the rest. I also have 32bit Perl with 64bit Integers.

    You are used to working with decimal numbers, but pack/unpack can be used to convert between binary, octal, decimal and hexadecimal. To use 48bit RGB, just think of the 6 octets as 3 16 bit numbers. Then this works:

    my $myNum = 65000 * (2**32);
    my ( $R,$G,$B ) = unpack("nnn", $myNum );
    print "\$myNum: $myNum\n(\$R,\$G,\$B): $R\t$G\t$B\n";
    A lot of monks here are better at the math than I, but I can hold my own most of the time!

    For the future, ask specific questions that show your problem and, when possible, show the code that demonstrates the problem. For your initial problem, you didn't have to worry about endianness, but you may have to consider it if you're working with different architectures.
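
A small sketch of the endianness point, with made-up channel values. Note that unpack works on the bytes of its string argument, so the 48-bit value has to be in packed form first; a plain number would be read as its decimal digits. 'n' is big-endian and 'v' is little-endian, while 'S' follows the machine's native order (little-endian on x86).

```perl
use strict;
use warnings;

# Made-up channel values for illustration.
my @rgb = (65000, 258, 1);

# The same three numbers serialized in both byte orders (6 octets each).
my $big    = pack('n3', @rgb);    # "network" order, high byte first
my $little = pack('v3', @rgb);    # "VAX" order, low byte first

print 'big-endian bytes:    ', unpack('H*', $big),    "\n";
print 'little-endian bytes: ', unpack('H*', $little), "\n";

# Reading with the matching template recovers the values ...
my @ok = unpack('n3', $big);
# ... while the wrong template silently byte-swaps each channel.
my @swapped = unpack('v3', $big);

print "matched:    @ok\n";
print "mismatched: @swapped\n";
```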

    Good Luck...Ed

    Regards...Ed

    "Well done is better than well said." - Benjamin Franklin

      Ed,

      This is the part I had working. Extracting uint16 is easy.

      >> ( $R,$G,$B ) = unpack("nnn", $myNum );
      My existing code:
      @ushort=unpack("S*", $buf); # Extract oodles of UINT16s
      What I can't figure out is:
      @UINT48 = unpack("???????", $BUF)
      where each UINT48 is 48 bits, 6 bytes: 1 contiguous chunk, 75% as large as a (long long) and 150% as large as a 4 byte, 32 bit long, exactly the size of the quanta I need.

      I need 1 contiguous 6byte REDGREENBLUE

      You are calling them "n"

      >> n An unsigned short (16-bit) in "network" (big-endian) order.
      I use 'S'
      >> S An unsigned short value.
      I have verified that the S values agree with Photoshop color picker.

      And, I broke your masterpiece tinkering with the buffer size. I was trying various sizes from 4k to 32M to find the optimal size for my large 4TB hard drives and RAID. Interesting timing results, but some wrong answers without a 4096 pixel buffer! Dang!

      With a 500+ MB/sec SSD, there is no need to buffer. With spinning drives, I usually like to grab a cache-full then process while the drive does another read-ahead. I usually work on my SSD anyway.

      Sysread must always return exactly the same byte count for the same file or the sky is falling. The size of the chunks has nothing to do with it. I screwed up something else. I may just leave it at 4096 * 6 and be done with it. It's already darn fast.
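
One possible cause of the wrong answers, offered as an assumption since the failing code is not shown: a chunk size such as 4k is not a multiple of 6 bytes, so pixels straddle chunk boundaries, and sysread is documented to return the number of bytes actually read, which can be fewer than requested. A minimal sketch that tolerates both by carrying the partial pixel over to the next read; count_colors and the test file are hypothetical, not code from this thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Count distinct 6-byte pixels in a file, reading $chunk bytes at a time.
# count_colors is a hypothetical helper, not code from this thread.
sub count_colors {
    my ($path, $chunk) = @_;
    open(my $fh, '<:raw', $path) or die "open $path: $!";
    my (%seen, $buf);
    my $carry = '';
    while (1) {
        my $got = sysread($fh, $buf, $chunk);
        die "sysread: $!" unless defined $got;
        last if $got == 0;                      # end of file
        $buf = $carry . $buf;
        my $whole = length($buf) - length($buf) % 6;
        $seen{$_}++ for unpack('(a6)*', substr($buf, 0, $whole));
        $carry = substr($buf, $whole);          # partial pixel, if any
    }
    close($fh);
    return scalar keys %seen;
}

# Tiny made-up file: 5 pixels, 3 distinct colors.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} pack('v3', @$_) for [1,2,3], [4,5,6], [1,2,3], [7,8,9], [4,5,6];
close($tmp);

# A chunk size that is not a multiple of 6 gives the same answer.
print count_colors($path, $_), " colors with chunk size $_\n" for 6, 7, 4096;
```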

      I think I like it as-is!

      This UINT48 is going to bug me until I figure it out. I may have to hack it into Perl myself!

      Thank you, Brian