in reply to Properly testing self-compiled character-encodings
Yes, that would be my approach as well (and I should add those cases to Encode::DIN66003). Take a set of strings and their known, manually verified encodings, and test that your module still encodes them properly:

    use strict;
    use warnings;
    use charnames ':full';    # needed for \N{...} on Perls older than 5.16
    use Test::More;
    use Encode 'encode', 'decode';

    # The expected bytes below are placeholders; replace them with
    # manually verified values for your encoding.
    my @tests = (
        { known => "Hello World", bytes_1141 => "Hello World" },
        { known => "\N{LATIN CAPITAL LETTER A WITH DIAERESIS}", bytes_1141 => "{" }, # or whatever
        { known => "\N{LATIN CAPITAL LETTER U WITH DIAERESIS}", bytes_1141 => "}" }, # or whatever
    );

    # Three checks per test case: encode, roundtrip, decode
    plan tests => 3 * @tests;

    for my $test (@tests) {
        my $name = $test->{name} || $test->{known};
        is( encode( 'CP1141', $test->{known} ), $test->{bytes_1141}, "Encoding for '$name'" );
        is( decode( 'CP1141', encode( 'CP1141', $test->{known} ) ), $test->{known}, "Roundtrip for '$name'" );
        is( decode( 'CP1141', $test->{bytes_1141} ), $test->{known}, "Decoding for '$name'" );
    }

Some of the test cases won't roundtrip cleanly, but you should likely also test for unknown characters like the Euro sign or curly braces.
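A rough sketch of how the "unknown character" cases could be checked. The helper names is_unmappable and roundtrips are made up for illustration, and iso-8859-1 is only a stand-in so the snippet runs with a stock Encode; substitute the name of your own encoding. It relies on Encode's documented CHECK argument (Encode::FB_CROAK) to make encode() die on characters the encoding cannot represent:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: true if $char cannot be represented in $encoding.
# With FB_CROAK as CHECK, encode() dies on unmappable characters.
sub is_unmappable {
    my ($encoding, $char) = @_;
    my $copy = $char;    # encode() may modify its argument when CHECK is set
    return eval { encode($encoding, $copy, Encode::FB_CROAK()); 1 } ? 0 : 1;
}

# Hypothetical helper: true if $char survives encode + decode unchanged.
# Without CHECK, unmappable characters are silently substituted.
sub roundtrips {
    my ($encoding, $char) = @_;
    return decode($encoding, encode($encoding, $char)) eq $char ? 1 : 0;
}

# Stand-in encoding; replace with your own (e.g. 'CP1141') once installed.
my $encoding = 'iso-8859-1';

for my $char ("\x{20AC}", "\x{C4}", "{") {
    printf "U+%04X unmappable=%d roundtrips=%d\n",
        ord($char),
        is_unmappable($encoding, $char),
        roundtrips($encoding, $char);
}
```

In a real test file you would wrap these in ok() calls and list, per encoding, which characters you expect to be unmappable.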
Update: Fixed module name, as spotted by choroba.