Re: Re: Why is variable interpolation suppressed in \x{$xxx} replacement ?

I like the chr() approach; however, I was concerned about getting tangled up in multibyte Unicode problems. It *looks* like if I use chr(???) with a ??? <= 255 I will always get the single byte I am looking for (i.e. not translated to/from some type of Unicode symbol set). Correct?
With respect to your second suggestion:
1. Yes, I also had this thought. I was originally trying to run with .../go; (for efficiency) and I was concerned that the captured string would not be inserted without recompilation. I was probably mistaken ... :)
2. The fact that this works implies that the initial \x{$dle} IS interpolated and the replacement one IS NOT. Abigail-II's explanation did not cover why one is interpolated and the other is not. Seems unduly inconsistent - even for Perl :)
Thanks to all! (Is it customary to reply with "thanks" (only), or is that considered unnecessary babble?) Thanks, Scott.
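
For reference, the chr() approach discussed above could look roughly like the following. This is only a sketch under assumptions: $dle is taken to hold a string of hex digits, and the sample data and the doubling of the character in the replacement are made up for illustration, not taken from the original code.

```perl
use strict;
use warnings;

# Assumption: the thread's $dle holds hex digits, e.g. "10".
my $dle = "10";

# Build the character once with chr() instead of relying on \x{...}.
my $dle_char = chr(hex $dle);

# Made-up sample data containing one such character.
my $data = "AB" . chr(0x10) . "CD";

# \Q...\E quotes the character in the pattern; the replacement is an
# ordinary interpolated string, so no \x{...} is needed on either side.
(my $stuffed = $data) =~ s/\Q$dle_char\E/$dle_char$dle_char/g;

my $out = join " ", unpack "C*", $stuffed;
print "$out\n";   # prints: 65 66 16 16 67 68
```

Because the character is built before the substitution runs, the question of whether \x{$dle} is interpolated in the pattern or in the replacement does not come up at all.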

Replies are listed 'Best First'.
Re: Re: Re: Why is variable interpolation suppressed in \x{$xxx} replacement ?
by bart (Canon) on Sep 29, 2003 at 18:52 UTC

It *looks* like if I use chr(???) with a ??? <= 255 I will always get the single byte I am looking for (i.e. not translated to/from some type of unicode symbol set). Correct ?

Well... in a way... yes. But you're overlooking one thing: if Perl concatenates a UTF-8 string with a Latin-1 string (at least, that's the only way to think about it that makes sense), Perl will convert the Latin-1 string to UTF-8. Let me show you with an example:

($\, $,) = ("\n", " "); # set up output mode
$string = "A" . chr(180) . "B";  # Latin-1
print unpack "C*", $string;
$string .= chr(367); # UTF-8
print unpack "C*", $string;

Output:

65 180 66
65 194 180 66 197 175

As you can see, the original chr(180), between chr(65) ("A") and chr(66) ("B"), is converted to UTF-8, resulting in two bytes.

So, if you want UTF-8, all you have to do is insert the characters into a UTF-8 string, or concatenate it with a UTF-8 string. That may even be a zero-length string, as returned by pack "U0":

($\, $,) = ("\n", " "); # set up output mode
$string = "A" . chr(180) . "B";  # Latin-1
print unpack "C*", $string;
$string .= pack "U0"; # zero length, UTF-8
print unpack "C*", $string;

Result:

65 180 66
65 194 180 66

p.s. This was tested with perl 5.6.1 on Windows. Not that it matters much; it shouldn't, except that you need at least perl 5.6.
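
On more recent perls the same effect can be inspected without depending on how unpack "C*" treats an upgraded string (as far as I know that behaviour changed in later releases, so treat the printed byte lists above as specific to the version tested). A rough sketch using utf8::is_utf8, utf8::upgrade and utf8::encode, with utf8::upgrade standing in for the pack "U0" trick; the helper utf8_bytes and the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the bytes of the UTF-8 encoding of a string.
sub utf8_bytes {
    my ($copy) = @_;        # work on a copy, utf8::encode modifies in place
    utf8::encode($copy);    # characters -> UTF-8 encoded bytes
    return unpack "C*", $copy;
}

my $string = "A" . chr(180) . "B";
my $flag_before = utf8::is_utf8($string) ? 1 : 0;   # stored one byte per character

utf8::upgrade($string);     # change the internal representation only
my $flag_after = utf8::is_utf8($string) ? 1 : 0;

# The characters themselves are unchanged: still three, the middle one still 180.
my @chars = map { ord } split //, $string;

print "flag: $flag_before -> $flag_after\n";         # flag: 0 -> 1
print "chars: @chars\n";                             # chars: 65 180 66
print "UTF-8 bytes: @{[ utf8_bytes($string) ]}\n";   # UTF-8 bytes: 65 194 180 66
```

The sketch suggests that chr(180) stays character 180 either way; what changes is how many bytes Perl uses to store it internally, which only becomes visible when the bytes are looked at directly.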
