mongre345 has asked for the wisdom of the Perl Monks concerning the following question:

I have come across a problem with incompatibility between character and byte semantics. Specifically, my program mixes XML config files (via XML::Simple) with remote SSH connections (via Net::SSH::Perl). I am using Perl 5.6.0 on Red Hat Linux 7.1.

Basically, the problem manifested itself when I tried to use my XML-derived configuration data (host, loginid) with the SSH connections supported by Net::SSH::Perl. Apparently Perl was treating my strings with character semantics, which caused strange and inconsistent problems when the data was sent to the SSH module. I understand this is due to how SSH does its serialization.
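To illustrate the kind of mismatch I mean, here is a minimal sketch. It assumes the serializer writes a 4-byte length prefix followed by the string data, which is how the SSH protocol encodes strings; I have not confirmed that this is exactly where Net::SSH::Perl goes wrong. The login ID is a made-up value.

```perl
use strict;
use warnings;

# Hypothetical login ID containing non-ASCII characters (not from my real config).
my $login = "j\x{f6}rg\x{263A}";

# Under character semantics, length() counts characters.
my $char_len = length $login;

# Under byte semantics, length() counts the bytes of Perl's internal UTF-8 form.
my $byte_len = do { use bytes; length $login };

# A length prefix computed from the character count (4-byte big-endian integer).
my $prefix = pack "N", $char_len;

printf "prefix declares %d, but the string occupies %d bytes\n",
    unpack("N", $prefix), $byte_len;
```

If the declared length and the number of bytes actually sent disagree like this, the receiving side would misparse the packet, which would be consistent with the authentication failures I saw.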

Generally the result was a complete failure to authenticate to the server; sometimes it did authenticate, but not consistently.

My solution was to place the following at the beginning of my script:

use Net::SSH::Perl;
use XML::Simple;
use bytes;

The use bytes pragma disables character semantics within its lexical scope (at least according to the man page).

I may be able to move the use bytes; around to a tighter scope as my script gets more complex.
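A tighter scope might look something like the sketch below. It only demonstrates that the pragma is lexical: byte semantics apply inside the block and character semantics return after it. The string is a placeholder, not real configuration data.

```perl
use strict;
use warnings;

# Placeholder value: a euro sign followed by three ASCII letters.
my $value = "\x{20AC}uro";

my $before = length $value;    # character semantics: counts characters

my $inside;
{
    use bytes;                 # byte semantics only until the closing brace
    $inside = length $value;   # counts bytes of the internal representation
}

my $after = length $value;     # character semantics again

print "before=$before inside=$inside after=$after\n";
```

In the real script, the block would wrap only the statements that hand data to the SSH module, leaving the XML handling under normal character semantics.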

My questions are: Is this the best way to deal with character vs. byte semantics in this situation? Are there any negative effects that I should expect from doing this? If so, what can I do to minimize them?

Thanks for any input.


Replies are listed 'Best First'.
Re: What is the proper way to deal with Char v. Byte Semantics
by ncw (Friar) on Jan 11, 2002 at 19:43 UTC
    I have to say I had exactly the same problem trying to send UTF-8 encoded XML over HTTP (using Frontier::RPC2). Perl and I both got terribly confused over whether any given string was UTF-8 or not.

    This ended up being a complete nightmare, especially since the code had to work on both 5.5 and 5.6. In the end I worked out that something in Frontier::RPC2 was converting my strings to UTF-8 behind my back under 5.6, whereas I had to convert them by hand under 5.5.

    I experimented with 'use bytes' too but that didn't cut it either.

    This code snippet might be useful:

    $unicode_perl = 1;
    {
        local $SIG{__DIE__};
        eval "use utf8; 1" or $unicode_perl = 0;
    }
    Altogether a very unpleasant experience, which makes me think that 5.6 isn't really ready for UTF-8 :-( Maybe 5.8 will be better in this regard.
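
For readers on Perl 5.8 or later, converting strings "by hand" can be done with the core Encode module. The following is a minimal sketch, not the method used in either post above: the config hash and its values are hypothetical stand-ins for what an XML parser might return, and UTF-8 is assumed to be the encoding the remote side expects.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical config values standing in for parsed XML data.
my %config = (host => "example.com", loginid => "j\x{f6}rg");

# Force the internal character representation, as a parser might hand it back.
utf8::upgrade($_) for values %config;

# Explicitly encode each value to a plain byte string before passing it on.
my %wire = map { $_ => encode('UTF-8', $config{$_}) } keys %config;

for my $key (sort keys %wire) {
    printf "%s: utf8 flag=%d, length=%d\n",
        $key, (utf8::is_utf8($wire{$key}) ? 1 : 0), length $wire{$key};
}
```

The point of the design is that the conversion happens once, at the boundary where data leaves the program, so the rest of the script can keep working with character strings.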