I've used in-memory lookup hashes much larger than what you propose without any trouble, and nothing in Perl is faster than a hash lookup. In fact, I don't think you need the regex at all - a plain hash lookup will be your best bet.
module on CPAN - it lets you build a cache of most frequently used words with minimum effort. But I would still try the hash solution before you write too much code - you may be surprised at how little of a memory hit it is.
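
For what it's worth, here is a minimal sketch of the plain hash lookup I mean. The word list and the is_known helper are made up for illustration; in your case the keys would presumably come from your own data file.

```perl
use strict;
use warnings;

# Hypothetical word list; in practice you would read these from a file.
my @words = qw(apple banana cherry);

# Build the lookup hash once. Only key existence matters, so the
# values are left undefined (hash slice assignment).
my %lookup;
@lookup{@words} = ();

# Illustrative helper: true if $word is a key in the lookup hash.
sub is_known {
    my ($table, $word) = @_;
    return exists $table->{$word};
}

for my $candidate (qw(banana durian)) {
    printf "%s: %s\n", $candidate,
        is_known(\%lookup, $candidate) ? 'found' : 'not found';
}
```

Using exists rather than testing the value means you don't have to store anything in the hash values, which keeps the memory footprint a little lower.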