in reply to Identifying scripts (writing systems)

Perhaps charscript() and charscripts() from Unicode::UCD could help?
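
For anyone unsure how the two functions differ, here is a minimal sketch, assuming a Perl recent enough to ship Unicode::UCD. The code point U+03A9 is just an illustrative choice and not something from the original question.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charscript charscripts);

# charscript() maps a single code point to the name of its script.
my $script = charscript(0x3A9);    # U+03A9 GREEK CAPITAL LETTER OMEGA

# charscripts() returns a hash reference: script name => list of ranges,
# where each range is [ first code point, last code point, script name ].
my $ranges       = charscripts();
my $greek_ranges = scalar @{ $ranges->{Greek} };

print "U+03A9 is in script: $script\n";
print "Greek is made up of $greek_ranges code point ranges\n";
```

For classifying text character by character, charscript() is probably the one you want; charscripts() is more useful when you need the full range table for a script.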

Re^2: Identifying scripts (writing systems)
by AppleFritter (Vicar) on Sep 16, 2014 at 22:45 UTC

    Ah, I had a hunch there would be another solution, one that I'd overlooked! Indeed, you're right, the following also works (and is faster to boot):

    use Unicode::UCD qw/charscript/;
    # ...
    my %scripts;
    $scripts{charscript ord substr $_, 0, 1}++ foreach (@lines);

    And it appears that Unicode::UCD was added to the Perl core in 5.8, so I can't even claim victory on that front... on the other hand, I enjoyed working on this, so although I could've saved my time, I didn't strictly waste it. :)
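
A self-contained version of the snippet above might look like the following. This is only a sketch: the contents of @lines are hypothetical sample data (the original input is not shown in the thread), and the 'Unknown' fallback is an added guard rather than part of the original one-liner.

```perl
use strict;
use warnings;
use utf8;    # the sample strings below contain non-ASCII literals
use Unicode::UCD qw(charscript);

# Hypothetical sample input; substitute your own decoded lines.
my @lines = ("Hello", "Привет", "Γειά", "World");

my %scripts;
for my $line (@lines) {
    next unless length $line;    # skip empty lines
    # Classify each line by the script of its first character.
    my $script = charscript(ord substr $line, 0, 1);
    # Guard in case charscript() returns undef instead of a name.
    $scripts{ $script // 'Unknown' }++;
}

printf "%-10s %d\n", $_, $scripts{$_} for sort keys %scripts;
```

Note that this only looks at the first character of each line, as in the original, so a line that starts with a digit or punctuation mark will be counted under whatever script charscript() assigns to that character (typically Common) rather than under the script of the text that follows.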