in reply to TagCloud and phrase frequency

As for your second query, the words you want to remove are known as "stop words". Use that phrase as googlefodder, and see Lingua::StopWords on CPAN.
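
For what it's worth, here is a minimal sketch of how stop-word filtering might slot into a word count, assuming Lingua::StopWords is installed and that its English list suits your data. The sample text and the variable names are made up for illustration; only getStopWords comes from the module.

```perl
use strict;
use warnings;
use Lingua::StopWords qw( getStopWords );

# getStopWords returns a hash reference whose keys are the stop words
my $stopwords = getStopWords('en');

# Hypothetical sample text, for illustration only
my $text = 'The quick brown fox jumps over the lazy dog';

# Lowercase, split on non-word characters, then drop empties and stop words
my @words = grep { length && !$stopwords->{$_} } split /\W+/, lc $text;

# Count whatever survives the filter
my %freq;
$freq{$_}++ for @words;

printf "%s: %d\n", $_, $freq{$_} for sort keys %freq;
```

The same grep would apply to phrases if you split into n-grams first, though you may then want to drop a phrase only when every word in it is a stop word.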