Re: Properly testing self-compiled character-encodings

Yes, that would be my approach as well (and I should add those cases to Encode::DIN66003). Take a set of strings and their known, manually verified encodings, and test that your module still encodes them properly:

use strict;
use warnings;
use Test::More;

use Encode 'encode', 'decode';
# Load your self-compiled encoding module here so that Encode knows 'CP1141'.

# Each entry pairs a string with its manually verified byte sequence.
my @tests = (
    { known => "Hello World", bytes_1141 => "Hello World" },
    { known => "\N{LATIN CAPITAL LETTER A WITH DIAERESIS}", bytes_1141 => "{" }, # or whatever
    { known => "\N{LATIN CAPITAL LETTER U WITH DIAERESIS}", bytes_1141 => "}" }, # or whatever
);

plan tests => 3*@tests;

for my $test (@tests) {
    my $name = $test->{name} || $test->{known};
    is( encode( 'CP1141', $test->{known} ), $test->{bytes_1141}, "Encoding for '$name'" );
    is( decode( 'CP1141', encode( 'CP1141', $test->{known} ) ), $test->{known}, "Roundtrip for '$name'" );
    is( decode( 'CP1141', $test->{bytes_1141} ), $test->{known}, "Decoding for '$name'" );
}

done_testing;

Some of the test cases won't round-trip cleanly, but you should probably also test characters the encoding may not know, like the Euro sign or curly braces.
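A minimal sketch of how such a test for unmappable characters could look. It assumes the stock 'iso-8859-1' encoding as a stand-in, because your self-compiled encoding is not available here; swap in your own encoding name and whichever characters you expect it to reject. The character list is only an illustration, not taken from any real mapping table of yours.

```perl
use strict;
use warnings;
use Test::More;
use Encode qw(encode FB_CROAK LEAVE_SRC);

# Assumption: 'iso-8859-1' stands in for the self-compiled encoding.
my $encoding = 'iso-8859-1';

# Characters that ISO-8859-1 cannot represent (illustrative choice).
my @unmappable = (
    "\x{20AC}",    # EURO SIGN
    "\x{263A}",    # WHITE SMILING FACE
);

for my $char (@unmappable) {
    my $name = sprintf 'U+%04X', ord $char;

    # Strict mode: FB_CROAK makes encode() die on an unmappable character.
    # LEAVE_SRC keeps encode() from modifying the source string.
    my $copy  = $char;
    my $bytes = eval { encode( $encoding, $copy, FB_CROAK | LEAVE_SRC ) };
    ok( !defined $bytes, "$name is rejected by $encoding in strict mode" );

    # Default mode: the character is replaced by the encoding's
    # substitution character, which is '?' for ISO-8859-1.
    $copy = $char;
    is( encode( $encoding, $copy ), '?', "$name becomes the substitution character" );
}

done_testing;
```

Checking both modes documents which behaviour your module has for characters outside its repertoire, so a later change to the mapping table shows up as a failing test instead of silently altered output.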

Update: Fixed module name, as spotted by choroba.
