in reply to TagCloud and phrase frequency
As for your second query, what you want to remove are known as "stop words"; use that phrase as googlefodder, and see Lingua::StopWords.