Re: hexadecimal division

I think it's best to clarify a few things - you may already know some or all of this, but since this kind of question comes up once in a while, I'll take this opportunity to write a little more on it.

First, hexadecimal is just another representation format for the more abstract concept of a number - although Perl handles all kinds of numbers pretty transparently, I'll just talk about integers for now. The idea of different representations of the same thing occurs in a lot of places:

0b00101010, 0x2a, 052, and XLII represent the same integer commonly known as 42 in decimal; note that in C (not Perl!) one can even write that integer as '*',
the byte sequences EF BB BF, FE FF, and 2B 2F 76 38 represent the same Unicode character (U+FEFF, the BOM, encoded as UTF-8, UTF-16BE, and UTF-7 respectively),
42°C, 315.15 K, and 107.6°F represent the same temperature, and
"2009-02-13 13:31:30 HST" and "2009-02-14 10:31:30 AEDT" represent the same second in time.

In any of these examples, if one is getting inputs in different formats, it's usually easiest to first convert them to a single internal representation, do the work, and then re-convert them to the desired format on output, instead of trying to mix different formats in one program.
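A minimal sketch of that "convert on input, work, convert on output" idea follows. The input strings and variable names are made up for illustration:

```perl
use warnings;
use strict;

# Hypothetical inputs: the same integer arriving in three notations.
my @inputs = ("0x2a", "0b00101010", "052");

# oct() understands the 0x, 0b and leading-0 prefixes, so it can serve
# as the single "parse into an integer" step.
my @numbers = map { oct($_) } @inputs;          # (42, 42, 42)

# Do the actual work on plain integers ...
my $sum = 0;
$sum += $_ for @numbers;                        # 126

# ... and only choose a representation again on output.
my $report = sprintf "dec=%d hex=0x%x oct=0%o bin=0b%b", ($sum) x 4;
print "$report\n";   # prints: dec=126 hex=0x7e oct=0176 bin=0b1111110
```

Note that oct() treats a string without any prefix as octal, so plain decimal input would need its own branch.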

Second, I've noticed that sometimes the term "hexadecimal" is used in place of "binary", probably because binary files are viewed with "hex editors", and also that there is sometimes confusion between a string containing a hexadecimal representation of some bytes and the bytes themselves.

Let's say a file contains four bytes whose numeric values in decimal are 80, 101, 114, and 108. This sequence of bytes interpreted as ASCII characters is the string "Perl". If I run a tool to get a hex dump of that file, I will see something like "5065726c", that is, each byte is being represented as a two-digit hexadecimal number. However, if in my code I have a string that looks like "5065726c", this is a sequence of eight characters and, ignoring Unicode for now, also a sequence of eight bytes (decimal values 53, 48, 54, 53, etc.). If in my program I instead want to work with the four bytes (8-bit integers) which the string "5065726c" represents, I need to convert it first. In Perl, the usual tools for all these kinds of conversions are pack, unpack, ord, chr, hex, oct, and sprintf.
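To make the difference between the hex string and the bytes it represents concrete, here is a small sketch using pack and unpack (the variable names are just for illustration):

```perl
use warnings;
use strict;

my $hexdump = "5065726c";               # eight characters of hex digits

# pack 'H*' turns each pair of hex digits into one byte.
my $bytes  = pack 'H*', $hexdump;       # the four-byte string "Perl"

# unpack 'C*' returns those bytes as a list of integers.
my @values = unpack 'C*', $bytes;       # (80, 101, 114, 108)

# And back again: unpack 'H*' produces the hex representation.
my $again  = unpack 'H*', $bytes;       # "5065726c"

print length($hexdump), " ", length($bytes), "\n";  # prints: 8 4
print "$bytes @values $again\n";   # prints: Perl 80 101 114 108 5065726c
```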

In Perl, the string "Perl" can also be written as "\x50\x65\x72\x6C", but regarding your code note that "\x10" is not the same as 0x10 - the latter is simply another way of writing the number 16, while the former is a string with one character, that character having a decimal value of 16 (the ASCII control character DLE / "data link escape"). If one has had exposure to a language like C or C++, one might be used to the equivalence (again ignoring Unicode) of strings and char arrays, and therefore being able to initialize a "string" with code like {0x50,0x65,0x72,0x6C,0}. But in Perl, strings can't directly be treated as arrays of integers - although this being Perl, of course there's a module for that ;-) (but please don't use it in real code). Another point: although Perl converts between the two transparently in most cases, 16 and "16" are still two different things, the first an integer and the second a two-character string.
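A short sketch of the difference between 0x10 and "\x10", plus one possible Perl counterpart to the C-style initializer, using pack (the variable names are only illustrative):

```perl
use warnings;
use strict;

my $number = 0x10;                      # the integer 16
my $string = "\x10";                    # a one-character string (DLE)

# The string has length 1, and its only character has the value 16.
print "$number ", length($string), " ", ord($string), "\n";  # prints: 16 1 16

# Rough equivalent of the C initializer {0x50,0x65,0x72,0x6C}:
# build the string from a list of integers.
my $perl = pack 'C*', 0x50, 0x65, 0x72, 0x6C;
print $perl eq "\x50\x65\x72\x6C" ? "same\n" : "different\n"; # prints: same
print "$perl\n";                                              # prints: Perl
```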

When it comes to the operations ~ | & ^, in Perl one has to know how the Bitwise String Operators work, so it matters whether one is working with strings or integers. 0x2A ^ 0x7A and "\x2A" ^ "\x7A" are essentially the same operation (00101010 xor 01111010 = 01010000), but in the former the inputs and output are integers, while in the latter they are strings.
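Here is a sketch of both variants, plus the pitfall that two strings of decimal digits get xor-ed character by character. This assumes the default operator behavior, that is, without the "bitwise" feature that newer Perls offer:

```perl
use warnings;
use strict;

my $int = 0x2A ^ 0x7A;          # numeric xor: 80, i.e. 0x50
my $str = "\x2A" ^ "\x7A";      # string xor: "\x50", i.e. "P"

# Pitfall: both operands are strings, so this is a string xor,
# even though the strings look like the numbers 42 and 122.
my $oops = "42" ^ "122";        # "\x05\x00\x32", not 80
my $ok   = 42 ^ 122;            # 80

printf "%d %s %s %d\n", $int, $str, unpack('H*', $oops), $ok;
# prints: 80 P 050032 80
```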

While it's possible to implement division in other domains, it's of course easiest with numbers, and we don't even really have to care what internal format the computer is using to store those numbers. So at the very latest, the division is when you'll need to convert. But in this case, I'd personally probably work with integers all the way through. (By the way, for non-negative integers, dividing by a power of two is the same as a bit shift, that is, int($x/16) is the same as $x>>4.)
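A quick check of the division/shift equivalence for a few non-negative sample values (the values are arbitrary):

```perl
use warnings;
use strict;

my @samples = (0, 15, 16, 100, 255, 4096);
my @mismatches;

for my $x (@samples) {
    my $divided = int($x / 16);   # integer division by 2**4
    my $shifted = $x >> 4;        # shift right by four bits
    printf "%4d: int(x/16)=%3d  x>>4=%3d\n", $x, $divided, $shifted;
    push @mismatches, $x if $divided != $shifted;
}

print @mismatches ? "mismatch: @mismatches\n" : "all equal\n";  # prints: all equal
```

For negative values the two are not interchangeable, since >> treats its operand as unsigned unless "use integer" is in effect.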

Lastly, Data::Dumper with Useqq turned on (or Data::Dump) will help you inspect the variables at each step. However, note that both of these modules will sometimes display a string like "16" as the number 16 - if you want to know exactly how Perl is storing your value internally, use Devel::Peek.

It is still a bit unclear to me if you need the output as an integer or string. In your example, the output as an integer would be 0 (aka 0x00), which Perl will transparently convert to the string "0", which is a one-character string, that character being the ASCII character 0x30 - which would probably explain the result you are currently getting. If instead you need the output as a binary value stored in a string, you'll have to convert the integer 0 to the string "\0" (aka "\x00") using e.g. pack, or, since it's a single character, chr.

use warnings;
use strict;
use List::Util qw/reduce/;
use Data::Dumper;
$Data::Dumper::Useqq=1;
$Data::Dumper::Indent=0; $Data::Dumper::Terse=1;

my $string = "a202005";
print Dumper($string),"\n";              # "a202005"
my @chars = map ord, split //, $string;  # or: unpack 'C*', $string
print Dumper(\@chars),"\n";              # [97,50,48,50,48,48,53]
my $xor = reduce { $a ^ $b } @chars;
print Dumper($xor),"\n";                 # 100
my $chr = chr $xor;                      # or: pack 'C', $xor
print Dumper($chr),"\n";                 # "d"
my $out = $xor>>4;
print Dumper($out),"\n";                 # 6
print Dumper(chr $out),"\n";             # "\6"

Updated as per replies. Also very minor edits for clarity.


Replies are listed 'Best First'.
Re^2: hexadecimal division (hexadecimal floating point -updated) by LanX (Saint) on Dec 16, 2017 at 20:44 UTC
> hexadecimal is just another representation format for the more abstract concept of an integer

Minor nitpick: to my surprise, it's now possible to define hexadecimal floats. See perldata#Scalar-value-constructors:

`0x1.999ap-4 # hexadecimal floating point (the "p" is required)`

Couldn't test yet, seems to be newer than 5.16 :)

update: yep

`DB<1> p $]`
`5.024001`
`DB<2> p 0x1.999Ap-4`
`0.100000381469727`

Though it's not mentioned in `perlnumber` yet.

update: seems to work; the exponent 4 counts as a power of 2 (NOT 16 or 256). (The 16**4 is there to move the fraction point.)

`DB<37> p 0x1999A /16**4 /2**4`
`0.100000381469727`

And it becomes obvious why `p` was used, since an `E` is a valid hex digit.

more testing:

`DB<49> eval qq{print 0x1p${_},"\n"} for -4..4`
`0.0625`
`0.125`
`0.25`
`0.5`
`1`
`2`
`4`
`8`
`16`

The hex fraction point moves with each 4 power steps (since 16 == 2**4):

`DB<58> x 0x10 , 0x1.0p4 , 0x1.0p0 , 0x10p-4`
`0  16`
`1  16`
`2  1`
`3  1`

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :) Wikisyntax for the Monastery
Re^3: hexadecimal division (hexadecimal floating point -updated) by haukex (Archbishop) on Dec 16, 2017 at 21:24 UTC
> hexadecimal floats

Yep, that's why I said "I'll just talk about integers for now" - but I could have worded it better, thank you for pointing that out! I've updated the node.
Re^4: hexadecimal division (hexadecimal floating point) by LanX (Saint) on Dec 16, 2017 at 21:35 UTC
No criticism at all - that's a stunning new feature I really love. :) It's rather the fact that it's not mentioned in `perlnumber` that drove me to this post.

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :) Wikisyntax for the Monastery
Re^3: hexadecimal division by choroba (Cardinal) on Dec 16, 2017 at 21:07 UTC
Yes, 5.22.

($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
