Re: Building binary strings.

They will all do the same thing, so choose whichever method you find most appealing.

However, since they represent the UTF-8 encoding of some characters, consider working with code-points like this:

use Encode;
...
$x = chr(65533);
$y = Encode::encode('utf-8', $x); # -> "\x{ef}\x{bf}\x{bd}" (UTF-8 encoding of U+FFFD)

When dealing with perl strings, it is helpful to keep in mind the following:

1. A perl string is just a sequence of numbers, and those numbers (characters) can be interpreted either as Unicode code-points or as byte values.

2. If the characters (numbers) in a string are meant to be interpreted as code-points, we call the string "text"; if they are meant to be interpreted as byte values, we call it "binary data".

The point is that the string "\x{aa}\x{42}\x{fe}" can be interpreted as either three Unicode code-points (U+00AA, U+0042, U+00FE) or as three bytes (0xaa, 0x42, 0xfe), and only the programmer knows what the correct interpretation is.
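
To make the two readings concrete, here is a small sketch (the variable names are mine, purely for illustration). It takes the same three-character string and shows what happens if you treat it as text and encode it as UTF-8: the three code-points become five bytes, whereas if the string was already binary data it should have been left alone.

```perl
use strict;
use warnings;
use Encode ();

# The same three numbers, however we choose to read them.
my $s = "\x{aa}\x{42}\x{fe}";

# Read as binary data: three bytes, to be used as they are.
my $as_bytes_hex = unpack('H*', $s);

# Read as text: three code-points (U+00AA, U+0042, U+00FE),
# which have to be encoded before they can leave the program.
my $encoded     = Encode::encode('UTF-8', $s);
my $encoded_hex = unpack('H*', $encoded);

printf "characters in string: %d\n", length $s;     # 3
printf "if binary data:       %s\n", $as_bytes_hex; # aa42fe
printf "if text, as UTF-8:    %s\n", $encoded_hex;  # c2aa42c3be
```

Perl cannot tell you which of the last two lines is the "right" output; that depends on what you meant $s to hold.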

Here are some examples of the difference. If a string (say $x) is meant to contain code-points, then the following usage of $x is logically incorrect even if perl does not report an error:

$y = Encode::decode('some encoding', $x);
binmode STDOUT, ":bytes"; print $x;
...

Conversely, if $x contains byte values, the following are incorrect uses of $x:

$y = Encode::encode('some encoding', $x);
$n = rindex($x, "\N{WHITE SMILING FACE}"); # need: use charnames ':full';
...

In these cases, perl may return a result, but the result is meaningless.
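
As an illustration of such a meaningless result, here is a sketch of the classic double-encoding mistake (again, the variable names are just for this example). Calling encode on a string that already holds bytes runs without error or warning, but each byte is treated as a code-point and encoded again:

```perl
use strict;
use warnings;
use Encode ();

# Text: a single code-point, U+00E9 (e with acute accent).
my $text = "\x{e9}";

# Correct: text is encoded once, giving binary data (bytes c3 a9).
my $bytes = Encode::encode('UTF-8', $text);

# Incorrect: $bytes is binary data, but encode() reads each byte
# as a code-point (U+00C3, U+00A9) and encodes those instead.
my $twice = Encode::encode('UTF-8', $bytes);

printf "encoded once:  %s\n", unpack('H*', $bytes);  # c3a9
printf "encoded twice: %s\n", unpack('H*', $twice);  # c383c2a9
```

The second result is a perfectly valid byte string, it just does not represent the character you started with.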

Hope this helps. Or better yet, hope this generates some more questions :-)
