I did something similar yesterday using Lingua::EN::Splitter's words() function.