in reply to statistics of a large text

This in no way addresses your question about optimization, but you could take advantage of slices to shorten some of your code. For example, replace:
my $cgram = join " ", $unigrams[$i], $unigrams[$i+1], $unigrams[$i+2], $unigrams[$i+3], $unigrams[$i+4];
with:
my $cgram = "@unigrams[ $i .. $i + 4]";
Interpolating the array slice inside double quotes joins its elements with the list separator variable ($"), which defaults to a single space, so the result matches the join " " version.
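If you want to convince yourself that the two forms agree, here is a minimal sketch. The sample words and the starting index are made up for illustration; your real @unigrams would come from your text.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the poster's @unigrams.
my @unigrams = qw(the quick brown fox jumps over the lazy dog);
my $i = 1;

# Explicit join over a slice versus interpolating the same slice.
my $joined       = join " ", @unigrams[ $i .. $i + 4 ];
my $interpolated = "@unigrams[ $i .. $i + 4 ]";

print "same\n" if $joined eq $interpolated;

# Interpolation uses $" as the separator; localize it to change it
# for one block only.
my $dashed;
{
    local $" = "-";
    $dashed = "@unigrams[ $i .. $i + 4 ]";
}
print "$dashed\n";
```

One thing to watch: if $i + 4 runs past the end of the array, the slice contains undef elements and both forms warn under "use warnings", so the loop bound still needs to stop four elements short of the end.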