I've studied the demonstration script and I understand everything it's doing, except for this bit:
my $MAX_BYTES = 25;
my ($MIN_BPC, $MAX_BPC) = (1, 4);
my $MAX_CHARS = $MAX_BYTES / $MIN_BPC;
What's going on here? $MAX_CHARS will always be set to the value of $MAX_BYTES, and $MAX_BPC seems to serve no function. Am I right?
Also, what happens if, in the initial truncation of the string done using substr() as an lvalue, we land smack dab in the middle of a grapheme, and the rightmost character in the resultant truncated string is, by itself, a valid grapheme?
D:\>perl -CO -Mcharnames=:full -wE "$MAX = 4; $cafe = qq/cafe\N{COMBINING ACUTE ACCENT}/; say $cafe; substr($cafe, $MAX) = ''; say $cafe;" > cafe.txt
D:\>
Here's the text in the output file cafe.txt:
café
cafe
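To make the concern concrete, here is a minimal sketch of the same truncation as a standalone script. It's my own illustration, not part of the demonstration script, and it assumes that counting matches of \X (an extended grapheme cluster) is a fair way to count graphemes. The variable names other than $MAX and $cafe are made up for this example.

```perl
use strict;
use warnings;
use charnames ':full';

my $MAX  = 4;
my $cafe = "cafe\N{COMBINING ACUTE ACCENT}";

# Before truncating: code points via length(), graphemes via \X.
my $chars_before     = length($cafe);
my $graphemes_before = () = $cafe =~ /\X/g;

# Same truncation as the one-liner: keep only the first $MAX code points.
substr($cafe, $MAX) = '';

my $chars_after     = length($cafe);
my $graphemes_after = () = $cafe =~ /\X/g;

printf "before: %d code points, %d graphemes\n", $chars_before, $graphemes_before;
printf "after:  %d code points, %d graphemes\n", $chars_after,  $graphemes_after;
print  $cafe eq 'cafe' ? "accent was silently dropped\n" : "accent survived\n";
```

The grapheme count is the same before and after, so nothing in the truncated string signals that the final "e" has lost its accent; it is still a valid grapheme on its own.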
(Thanks again for this very helpful script!)
In reply to Re^4: Best Way to Get Length of UTF-8 String in Bytes?
by Jim
in thread Best Way to Get Length of UTF-8 String in Bytes?
by Jim