This script counts how many times each letter occurs at the end of a word in a file. It was inspired by ambrus's CUFP 779860, Final letter frequency (for french).
Update: removed "use utf8;" (which was not doing what I thought)
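#!/usr/bin/perl -w
use strict;
use warnings;

my %wc;    # final letter (lowercased) => number of words ending in it
while (<>) {
    # Visit every run of alphabetic characters on the current line.
    while (/(\p{IsAlpha}+)/g) {
        my $word = $1;
        my $last = substr $word, -1;
        print "\ncounting [$last] for word [$word] ";
        $wc{lc($last)}++;
    }
}

# Report the tallies, least frequent letter first.
for my $l (sort { $wc{$a} <=> $wc{$b} } keys %wc) {
    printf "\n%5d %s", $wc{$l}, $l;
}
print "\n";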
This script can be run like this.
./wordcount.pl < ~/dump/27827-8.txt
or
cat ~/dump/advsh12.txt | ./wordcount.pl
This is what I see when I run it for "The Kama Sutra of Vatsyayana by Vatsyayana" http://www.gutenberg.org/etext/27827
1 q
1 ###
1 ###
2 ###
2 ###
2 ###
3 j
4 #####
29 v
33 b
33 x
50 z
116 u
154 c
190 p
249 i
299 w
378 k
1208 m
1231 l
1848 h
2232 a
2347 g
2782 o
3196 y
3510 f
4300 t
4557 r
5713 n
6324 d
8490 s
12606 e
This is what I see for "The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle" http://www.gutenberg.org/etext/1661
1 j
1 ###
2 q
4 v
6 ###
17 z
79 x
90 b
139 c
700 p
1290 w
1455 k
1581 u
1922 m
2597 l
2911 a
2952 h
3034 g
3064 i
3467 f
5038 o
6309 y
6743 r
8317 n
11335 s
11807 t
12068 d
21277 e
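A note on the "use utf8;" update above: that pragma only tells perl that the script's own source is UTF-8, and it does not decode the data being read. If the input contains accented letters, one option is to decode it explicitly with an I/O layer. Below is a minimal sketch of that idea, not the script I actually ran. The sub name final_letter_counts, the sample string and the choice of UTF-8 are all assumptions for illustration; the real text may be in another encoding, such as ISO-8859-1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: tally the final letter of every word read from
# an already-decoded filehandle. Returns a hash ref of letter => count.
sub final_letter_counts {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        while ($line =~ /(\p{Alpha}+)/g) {
            $count{ lc(substr($1, -1)) }++;
        }
    }
    return \%count;
}

# Sample input as raw UTF-8 bytes: "café thé the cat".
# Assumption: UTF-8 is only an example encoding here.
my $bytes = "caf\xC3\xA9 th\xC3\xA9 the cat\n";

# The :encoding layer turns the bytes into characters while reading,
# so \p{Alpha} sees one letter "é" instead of two separate bytes.
open my $in, '<:encoding(UTF-8)', \$bytes
    or die "cannot open in-memory file: $!";
my $counts = final_letter_counts($in);
close $in;

# Encode again on the way out so the terminal receives valid UTF-8.
binmode STDOUT, ':encoding(UTF-8)';
for my $l (sort { $counts->{$a} <=> $counts->{$b} || $a cmp $b } keys %$counts) {
    printf "%5d %s\n", $counts->{$l}, $l;
}
```

For a real file, the same sub could be handed a handle opened with whatever layer matches the text, for example '<:encoding(iso-8859-1)'.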
In reply to Final letter frequency -- for english by lunatech