Yes, that would be my approach as well (and I should add those cases to Encode::DIN66003). Take a set of strings and their known, manually verified encodings, and test that your module still encodes them properly:
    use strict;
    use warnings;
    use Test::More;
    use Encode 'encode', 'decode';

    # The expected bytes are placeholders; replace them with manually verified values
    my @tests = (
        { known => "Hello World", bytes_1141 => "Hello World" },
        { known => "\N{LATIN CAPITAL LETTER A WITH DIAERESIS}", bytes_1141 => "{" }, # or whatever
        { known => "\N{LATIN CAPITAL LETTER U WITH DIAERESIS}", bytes_1141 => "}" }, # or whatever
    );

    # Three assertions per test case: encode, roundtrip, decode
    plan tests => 3 * @tests;

    for my $test (@tests) {
        my $name = $test->{name} || $test->{known};
        is encode( 'CP1141', $test->{known} ), $test->{bytes_1141}, "Encoding for '$name'";
        is decode( 'CP1141', encode( 'CP1141', $test->{known} ) ), $test->{known}, "Roundtrip for '$name'";
        is decode( 'CP1141', $test->{bytes_1141} ), $test->{known}, "Decoding for '$name'";
    }
Some of these test cases won't roundtrip cleanly, but you should probably also test characters the encoding may not know, like the Euro sign or curly braces.
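A minimal sketch of such a test for unknown characters, under some assumptions: `iso-8859-1` only stands in for the self-compiled encoding (it ships with Encode and has no Euro sign), and `fails_to_encode` is a hypothetical helper, not part of Encode. With the `Encode::FB_CROAK` check value, `encode` dies on a character it cannot map instead of silently substituting one:

```perl
use strict;
use warnings;
use Test::More;
use Encode ();

# Hypothetical helper: true if $string contains a character that
# $encoding cannot represent.
sub fails_to_encode {
    my ( $encoding, $string ) = @_;
    my $copy = $string;    # work on a copy; encode() may modify its source when CHECK is set
    my $ok = eval {
        Encode::encode( $encoding, $copy, Encode::FB_CROAK() | Encode::LEAVE_SRC() );
        1;
    };
    return !$ok;
}

# Assumption: iso-8859-1 is a stand-in; use your own encoding name here.
my $encoding = 'iso-8859-1';

ok  fails_to_encode( $encoding, "\N{EURO SIGN}" ),
    "Euro sign is rejected by $encoding";
ok !fails_to_encode( $encoding, "\N{LATIN CAPITAL LETTER A WITH DIAERESIS}" ),
    "A with diaeresis is accepted by $encoding";

done_testing;
```

Whether the Euro sign or the curly braces are actually unmappable depends on your code page, so adjust the expectations to what your mapping defines.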
Update: Fixed module name, as spotted by choroba.
In reply to Re: Properly testing self-compiled character-encodings
by Corion
in thread Properly testing self-compiled character-encodings
by yulivee07