When I worked on something similar, I decided that a word could contain a-z, single quotes, and hyphens. I then had to code around words wrapped in single quotes, so it wasn't as simple as /[a-z'-]+/.

I think I ended up with

my @words = /(\w[\w'-]*\w|\w+)/g;
or similar.
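
Here is a rough sketch of how that pattern behaves on quoted and hyphenated text. The sub name extract_words and the sample sentence are made up for illustration. Note that \w also matches digits and underscores, so it is a bit looser than plain a-z.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original code): pull words out of a string.
sub extract_words {
    my ($text) = @_;
    # A word must start and end with a word character; apostrophes and
    # hyphens are only allowed in between. The \w+ alternative picks up
    # one-character words such as "a" or "I".
    return $text =~ /(\w[\w'-]*\w|\w+)/g;
}

my @words = extract_words("She said 'don't panic' to the well-known co-author.");
print join('|', @words), "\n";
# prints: She|said|don't|panic|to|the|well-known|co-author
```

Because the match has to begin and end on \w, the quotes around 'don't panic' are dropped while the apostrophe inside "don't" survives.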

“Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.”
M-J D