Well, I finally got around to the troublesome matter of the testing script for the module, and found I was forced to use one of the tools for testing...so, Test::More::UTF8 it was.

...until it wasn't.

Something in the UTF-8 handling is different, it appears, and the script failed to run. Rather than spend more hours troubleshooting what is unfamiliar to me, I've abandoned the t/ folder for testing and gone the other route of keeping my own test.pl script in the module's main folder. In it I use Test::Simple for a few simple tests, then run my own tests. It seems to work okay, but I wish it were better.

#!/usr/bin/perl

use strict;
use warnings;
use 5.008;
use utf8;
use FindBin qw($Bin);
use lib "$Bin/lib";
use blib './lib/';
use Regexp::CharClasses::Thai qw(:all);  # a single import line is enough


# Encode everything printed to STDOUT as strict UTF-8.
binmode STDOUT, ':encoding(UTF-8)';

use Test::Simple 'no_plan';

my $failure = 0;

#########################
# TEST ITEMS

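# Pass ok() the result of the match itself. A quoted, non-empty string is
# always true, so a quoted expression here could never fail.
ok( 'ก' =~ /\p{IsThai}/ );
ok( 'ก' =~ /\p{InThaiCons}/ );
ok( 'ก' =~ /\p{InThaiMCons}/ );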

is( q"'ก' =~ /\p{IsKokai}/",1,' Match for  "ก" =~ /\p{IsKokai}/');
is( q"'ก' =~ /\p{InThai}/",1,' Match for  "ก" =~ /\p{InThai}/');
is( q"'ก' =~ /\p{InThaiAlpha}/",1,' Match for  "ก" =~ /\p{InThaiAlpha}/');
is( q"'ก' =~ /\p{InThaiCons}/",1,' Match for  "ก" =~ /\p{InThaiCons}/');
isnt( q"'ก' =~ /\p{InThaiHCons}/",0,' No match for  "ก" =~ /\p{InThaiHCons}/');
is( q"'ก' =~ /\p{InThaiMCons}/",1,' Match for  "ก" =~ /\p{InThaiMCons}/');
isnt( q"'ก' =~ /\p{InThaiLCons}/",0,' No match for  "ก" =~ /\p{InThaiLCons}/');
isnt( q"'ก' =~ /\p{InThaiDigit}/",0,' No match for  "ก" =~ /\p{InThaiDigit}/');
isnt( q"'ก' =~ /\p{InThaiTone}/",0,' No match for  "ก" =~ /\p{InThaiTone}/');
isnt( q"'ก' =~ /\p{InThaiVowel}/",0,' No match for  "ก" =~ /\p{InThaiVowel}/');
isnt( q"'ก' =~ /\p{InThaiCompVowel}/",0,' No match for  "ก" =~ /\p{InThaiCompVowel}/');
isnt( q"'ก' =~ /\p{InThaiPreVowel}/",0,' No match for  "ก" =~ /\p{InThaiPreVowel}/');
isnt( q"'ก' =~ /\p{InThaiPostVowel}/",0,' No match for  "ก" =~ /\p{InThaiPostVowel}/');
isnt( q"'ก' =~ /\p{InThaiPunct}/",0,' No match for  "ก" =~ /\p{InThaiPunct}/');
is( q"'ก' =~ /\p{InThaiFinCons}/",1,' Match for  "ก" =~ /\p{InThaiFinCons}/');
isnt( q"'ก' =~ /\p{InThaiMute}/",0,' No match for  "ก" =~ /\p{InThaiMute}/');
 

is( q"'ไ' =~ /\p{InThai}/",1,' Match for  "ไ" =~ /\p{InThai}/');
is( q"'ไ' =~ /\p{InThaiAlpha}/",1,' Match for  "ไ" =~ /\p{InThaiAlpha}/');
is( q"'ไ' =~ /\p{InThaiWord}/",1,' Match for  "ไ" =~ /\p{InThaiWord}/');
isnt( q"'ไ' =~ /\p{InThaiCons}/",0,' No match for  "ไ" =~ /\p{InThaiCons}/');
isnt( q"'ไ' =~ /\p{InThaiHCons}/",0,' No match for  "ไ" =~ /\p{InThaiHCons}/');
isnt( q"'ไ' =~ /\p{InThaiMCons}/",0,' No match for  "ไ" =~ /\p{InThaiMCons}/');
isnt( q"'ไ' =~ /\p{InThaiLCons}/",0,' No match for  "ไ" =~ /\p{InThaiLCons}/');
isnt( q"'ไ' =~ /\p{InThaiDigit}/",0,' No match for  "ไ" =~ /\p{InThaiDigit}/');
isnt( q"'ไ' =~ /\p{InThaiTone}/",0,' No match for  "ไ" =~ /\p{InThaiTone}/');
is( q"'ไ' =~ /\p{InThaiVowel}/",1,' Match for  "ไ" =~ /\p{InThaiVowel}/');
isnt( q"'ไ' =~ /\p{InThaiCompVowel}/",0,' No match for  "ไ" =~ /\p{InThaiCompVowel}/');
is( q"'ไ' =~ /\p{InThaiPreVowel}/",1,' Match for  "ไ" =~ /\p{InThaiPreVowel}/');
isnt( q"'ไ' =~ /\p{InThaiPostVowel}/",0,' No match for  "ไ" =~ /\p{InThaiPostVowel}/');
isnt( q"'ไ' =~ /\p{InThaiPunct}/",0,' No match for  "ไ" =~ /\p{InThaiPunct}/');
is( q"'ไ' =~ /\p{IsSaraaimaimalai}/",1,' Match for  "ไ" =~ /\p{IsSaraaimaimalai}/');


    my $pv = 'ข่าวนี้ได้แพร่สะพัดออกไปอย่างรวดเร็ว';
    my $prevowel_syllables = $pv  =~ s/
            (
            (?:\p{InThaiPreVowel})
            (?:
              (?:\p{InThaiDualC1}\p{InThaiDualC2})
              |
              (?:\p{InThaiCons}){1}
            )
            (?:\p{InThaiTone}\p{InThaiCompVowel}\p{InThaiPostVowel}){0,3}
              (?:
                (?:\p{InThaiFinCons}\p{IsYoyak}\p{IsWowaen}){0,5}
                (?!\p{InThaiPostVowel})
              )*
            (?:\p{InThaiMute})?
            )           
            /($1)/gx;

    print "Syllables with pre-vowels in 'ข่าวนี้ได้แพร่สะพัดออกไปอย่างรวดเร็ว' --> $pv: $prevowel_syllables\n";  # 4

if ($prevowel_syllables == 4) {
	print "Syllables test succeeded.\n\n";
} else {
	print "Syllables test FAILED.\n\n";
	$failure++;
}

if ($failure) {
	print "No success: $failure tests failed.\n";
	exit $failure;
} else {
	print "Success.  All tests passed.\n";
	exit 0;
};

exit;


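# Evaluate the quoted match expression and compare its result with $val.
# A failed match counts as 0. An eval error (for example, an unknown
# property name) is reported and counted as a failure.
sub is {
	my ($test, $val, $say) = @_;
	print "TEST: $say\t";
	my $result = eval $test;
	my $error  = $@;
	if (!$error && ($result || 0) == $val) {
		print "Passed in the affirmative.\n";
	} else {
		print "FAILED! INCORRECTLY NEGATIVE.\n";
		print "\teval error: $error" if $error;
		$failure++;
	}
}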

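# Counterpart of is() for expected non-matches: the test passes when the
# result equals $val (0 for "no match"). An eval error is counted as a
# failure; previously it was silently reported as a pass.
sub isnt {
	my ($test, $val, $say) = @_;
	print "TEST: $say\t";
	my $result = eval $test;
	my $error  = $@;
	if ($error || ($result || 0) != $val) {
		print "FAILED! INCORRECTLY AFFIRMATIVE.\n";
		print "\teval error: $error" if $error;
		$failure++;
	} else {
		print "Passed in the negative.\n";
	}
}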
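
The long run of is/isnt lines above could probably be shortened with a table of expectations and a loop, which also does away with the string eval. A rough sketch only, under stated assumptions: the property names used here (Thai, Alphabetic, Nd, Punct) are Perl built-ins standing in for the module's own classes, and check_props is a name made up for this example.

```perl
use strict;
use warnings;
use utf8;    # the source contains literal Thai characters

# check_props (hypothetical helper): test one character against a table of
# property name => expected result (1 = should match, 0 = should not).
# Returns the number of expectations that were not met.
sub check_props {
    my ($char, %expect) = @_;
    my $failed = 0;
    for my $prop (sort keys %expect) {
        # Build the pattern text at run time, so no string eval is needed.
        my $pattern = '\p{' . $prop . '}';
        my $matched = $char =~ /$pattern/ ? 1 : 0;
        $failed++ if $matched != $expect{$prop};
    }
    return $failed;
}

my $failure = 0;
$failure += check_props( 'ก', Thai => 1, Alphabetic => 1, Nd => 0, Punct => 0 );
$failure += check_props( '๑', Thai => 1, Nd => 1, Alphabetic => 0 );
print $failure ? "FAILED: $failure expectation(s) not met.\n" : "All passed.\n";
```

With the module loaded, the same loop should work with names such as InThaiCons in place of the built-ins, provided the properties are imported into the package where the match is compiled.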

EDIT: Cleaned it up a bit and changed to "pre" tags, hoping for better readability.

Maybe someday I'll figure out the Test::More::UTF8. Until then, this approach will hopefully at least get the module installed.
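
For the record, one possible route back to the t/ folder without Test::More::UTF8: the "Wide character" warnings seem to come from Test::Builder's own output handles, and those can be given a UTF-8 layer directly. A minimal sketch, assuming plain Test::More; the file name is hypothetical, and Perl's built-in \p{Thai} and \p{Nd} stand in for the module's classes so that the example runs on its own.

```perl
#!/usr/bin/perl
# t/01-thai.t (hypothetical file name)
use strict;
use warnings;
use utf8;    # the source contains literal Thai characters
use Test::More;

# Give each of Test::Builder's output handles a UTF-8 layer, so that
# Thai text in test names does not trigger "Wide character in print".
my $builder = Test::More->builder;
for my $handle (qw(output failure_output todo_output)) {
    binmode $builder->$handle(), ':encoding(UTF-8)';
}

# Built-in properties are used here in place of the module's classes.
like(   'ก', qr/\p{Thai}/, '"ก" matches \p{Thai}' );
unlike( 'ก', qr/\p{Nd}/,   '"ก" does not match \p{Nd}' );
like(   '๑', qr/\p{Nd}/,   '"๑" (Thai digit one) matches \p{Nd}' );

done_testing();
```

Because like() and unlike() take the string and the pattern separately, no quoting or eval of the match expression is needed, and a failing test reports the string and pattern involved.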

Blessings,

~Polyglot~


In reply to Re^5: Listing out the characters included in a character class [wide character warning] by Polyglot
in thread Listing out the characters included in a character class by Polyglot
