in reply to Word density
In your case, it's easier to write a regex for what you want (a word) than for what you don't want (all the between-word sequences):

    my @words = $content =~ m/([A-Za-z]+(?:\'[A-Za-z]+)?)/g;

This matches a run of alphabetic characters, optionally followed by an apostrophe and more alphabetics. It's obviously preliminary; adjust as necessary according to your definition of a "word".
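For what it's worth, here is a rough sketch of how the extracted words could feed a density count. The sample text in $content, the %count hash, and the choice to fold case with lc are all my own assumptions for illustration, not anything from your code, so treat it as a starting point rather than the way to do it.

```perl
use strict;
use warnings;

# Hypothetical sample text; substitute your real content here.
my $content = "The cat's toy isn't the dog's toy. The end.";

# Same idea as the regex above: letters, optionally followed by
# an apostrophe and more letters. In list context with /g, the
# match returns every captured word.
my @words = $content =~ m/([A-Za-z]+(?:'[A-Za-z]+)?)/g;

# Tally occurrences, folding case so "The" and "the" count together
# (an assumption; drop the lc if case matters to you).
my %count;
$count{lc $_}++ for @words;

# Density is occurrences divided by the total number of words.
my $total = @words;
for my $w (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    printf "%-8s %d  %.1f%%\n", $w, $count{$w}, 100 * $count{$w} / $total;
}
```

Note that contractions and possessives such as "isn't" and "cat's" stay as single words here, which may or may not be what you want for a density measure.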
blokhead