Items I don't think you're going to find via this post on PerlMonks:
Someone who is conversant with the Japanese script and can evaluate your data file. I would suggest you try a local Perl Mongers group like Tokyo.pm.
A decent Perl developer to help you write the code, given that you say there are 30 files with millions of lines in each.
Items for your reference:
You will need the support of CPAN modules like Unicode::Japanese.
I'm sure the monks can point you in the right direction, but for that, can you please provide a more accurate dataset in your original post, along with some code (or pseudocode or a design, if you are new)?
The Great Programmer is one who inspires others to code, not just one who writes great code