Bill,
"The fact that Result1 is a sigma proves that the shell script did not encode the \xe4."
I disagree, because the backticks (``) apply some implicit translations that you aren't controlling.
To test it, you need to capture the raw output bytes of the perl -e one-liner under test. Then you can compare them to the expected values.
Here is an example test where I print \xe4\xe0 twice: first without a binmode in the one-liner, then with a binmode in the one-liner. You can see that the output bytes differ between the two runs, and you can test that those bytes match your expectations.
C:\Users\peter.jones\Downloads\TempData\perl>chcp
Active code page: 437
C:\Users\peter.jones\Downloads\TempData\perl>perl pm.pl
__SOURCE__
#!perl
use 5.012; # strict, //
use warnings;
use IPC::Open2;
use Test::More;
undef $\;
print "\n__SOURCE__\n";
seek \*DATA, 0, 0;
print for <DATA>;
$\ = "\n";
{
my $pid = open2(my $ofh, my $ifh, 'perl', '-e', q("print qq(\xe4\xe0)"));
binmode $ofh, ':raw'; # need to read from the open2 output file handle in raw mode, so you're looking at bytes, _not_ characters
chomp(my $line = <$ofh>);
print "without binmode, the high 8-bit characters pass through untranslated: ", unpack 'H*', $line;
is $line, "\xE4\xE0", 'the bytes should be unedited';
print "and printed out during test script: '$line'";
}
{
my $pid = open2(my $ofh, my $ifh, 'perl', '-e', q("binmode STDOUT, ':encoding(Cp437)'; print qq(\xe4\xe0)"));
binmode $ofh, ':raw'; # need to read from the open2 output file handle in raw mode, so you're looking at bytes, _not_ characters
chomp(my $line = <$ofh>);
print "with binmode, xE4 gets translated to x84 for a-umlaut, and xE0 gets translated to x85 for a-grave: ", unpack 'H*', $line;
is $line, "\x84\x85", 'the bytes should be CP437-encoded';
# so here, instead of printing the hexdump of the captured line, you could compare
print "and printed out during test script: '$line'";
}
done_testing();
__END__
__OUTPUT__
without binmode, the high 8-bit characters pass through untranslated: e4e0
ok 1 - the bytes should be unedited
and printed out during test script: 'Σα'
with binmode, xE4 gets translated to x84 for a-umlaut, and xE0 gets translated to x85 for a-grave: 8485
ok 2 - the bytes should be CP437-encoded
and printed out during test script: 'äà'
1..2
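The expected byte values themselves can also be checked in-process, without a child process or a console. This is only a minimal sketch, assuming the core Encode module and its cp437 table; the variable names are just for illustration and are not part of the test script above.

```perl
#!perl
use strict;
use warnings;
use Encode qw(encode decode);

# Treat \xe4\xe0 as the characters U+00E4 (a-umlaut) and U+00E0 (a-grave)
my $chars = "\xe4\xe0";

# Encoding to CP437 gives the bytes a CP437 console needs to show those two characters
my $encoded_hex = unpack 'H*', encode('cp437', $chars);
print "encoded to cp437: $encoded_hex\n";

# Decoding the untranslated bytes E4 E0 as CP437 gives the characters the
# console shows when nothing has encoded them (sigma, alpha)
my $shown = join ' ',
    map { sprintf 'U+%04X', ord($_) }
    split //, decode('cp437', "\xe4\xe0");
print "raw e4e0 shown as: $shown\n";
```

The first line corresponds to the "with binmode" case, and the second explains why the "without binmode" case displays as sigma and alpha.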
Thank you, pryrt. Your second example is exactly what I asked for in my first post. I have already extended it to test newlines by appending \n to the input string and \r\n to the expected output string (and removing the chomp). My project started as a minor annoyance and ended with a one-line solution. I never dreamed that, in between, I would need to learn details of Windows (including a reference to an old DOS manual to get started), Unicode, and even Perl (I had never used a child process before).
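A rough sketch of that newline check follows. It is not the actual test: as an assumption for illustration, a temp file written through explicit :encoding(cp437) and :crlf layers stands in for the child's STDOUT, so the result does not depend on the platform's default layers. The names $path, $bytes and $expected are hypothetical.

```perl
#!perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the child's STDOUT: a temp file written through
# CP437 encoding plus CRLF translation
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open my $out, '>:raw:encoding(cp437):crlf', $path or die "open for write: $!";
print {$out} "\xe4\xe0\n";          # input string with \n appended
close $out or die "close: $!";

# Read the file back in raw mode, with no chomp, so the line ending stays visible
open my $in, '<:raw', $path or die "open for read: $!";
my $bytes = do { local $/; <$in> };
close $in;

my $expected = "\x84\x85\r\n";      # expected output with \r\n appended
print unpack('H*', $bytes), "\n";
print $bytes eq $expected ? "match\n" : "mismatch\n";
```

Skipping the chomp matters here: a chomp would remove the trailing \n and leave a stray \r behind in the comparison.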